Connecting to Azure Blob Storage

This section contains an overview of Azure Blob Storage.

General information

Field Description
Data source Azure Blob Storage
Supported versions N/A
Connection string wasbs://
Packaged? Yes
Certified? Yes

Supported features
Analyze data Yes
Archive breaking records Yes
Estimate job Yes
Pushdown No

Processing capabilities
Spark agent Yes
Yarn agent Yes

Minimum user permissions

For Collibra DQ to access your Azure Blob Storage containers, you need the following permissions:

  • ROLE_ADMIN in Collibra DQ.
  • Read access on your Blob Storage containers to create a basic Azure Blob connection.
  • Read and write access to the Blob Storage containers where breaking records are archived from Collibra DQ. This is only necessary if you use the archive breaking records feature.

Recommended and required connection properties

Required Connection Property Type Value

Yes Name Text The unique name used for your connection.

Yes Connection URL String The connection string value of your Azure Blob Storage connection.

wasbs://<container>@<account name>.dfs.core.windows.net/<path>/<file name>

Tip See the Connection URL elements section for more information on the syntax.

No Target Agent Text The Agent used to submit your DQ Job.

Yes Auth Type Option The method to authenticate your connection.

Note The configuration requirements differ depending on the Auth Type you select. See Authentication for more details on the available authentication types.

No Archive Breaking Records Location Option When you select this option and complete the required setup steps, Collibra DQ can export breaking records as CSV files to the Azure Blob location.

No Driver Properties String The configurable driver properties for your connection. Multiple properties must be comma delimited. For example, abc=123,test=true.
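
The comma-delimited format of the Driver Properties value can be checked before you paste it into the connection form. The following is a minimal Python sketch, not part of Collibra DQ: the function name parse_driver_properties is hypothetical, and it assumes that neither keys nor values contain commas.

```python
from typing import Dict


def parse_driver_properties(raw: str) -> Dict[str, str]:
    """Split a comma-delimited key=value string into a dictionary.

    Hypothetical helper for illustration; assumes no commas inside
    keys or values.
    """
    properties: Dict[str, str] = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            # Ignore empty entries such as a trailing comma.
            continue
        # Split on the first "=" only, so values may contain "=".
        key, separator, value = item.partition("=")
        if not separator or not key.strip():
            raise ValueError(f"Expected key=value, got {item!r}")
        properties[key.strip()] = value.strip()
    return properties


print(parse_driver_properties("abc=123,test=true"))
```

With the example value from the table, the function returns a dictionary with the keys abc and test, and it raises a ValueError for an entry that has no equals sign.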

Connection URL elements

Field Description
File scheme The wasbs protocol is used as the scheme identifier.
Container The parent location that holds the files and folders. This is the same as file system in the Azure Data Lake Storage service.
Account name The name given to your storage account during creation.
Path A forward slash (/) delimited representation of the directory structure.
File name The name of the individual file. This parameter is optional when you address a directory.
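
The elements above can be assembled into a Connection URL programmatically. The following is a minimal Python sketch under stated assumptions: the function name build_wasbs_url and its parameters are hypothetical, and the default endpoint suffix is copied from the URL template in the connection properties table. Adjust the suffix if your storage account uses a different endpoint.

```python
from typing import Optional


def build_wasbs_url(
    container: str,
    account_name: str,
    path: str = "",
    file_name: Optional[str] = None,
    endpoint_suffix: str = "dfs.core.windows.net",  # assumption: taken from the template
) -> str:
    """Assemble a wasbs:// URL from container, account name, path, and file name."""
    if not container or not account_name:
        raise ValueError("container and account_name are required")

    # Drop empty segments so leading, trailing, or doubled slashes
    # do not leak into the URL.
    segments = [segment for segment in path.split("/") if segment]
    if file_name:
        # The file name is optional when the URL addresses a directory.
        segments.append(file_name)

    url = f"wasbs://{container}@{account_name}.{endpoint_suffix}"
    if segments:
        url += "/" + "/".join(segments)
    return url


print(build_wasbs_url("sales", "mystorageacct", "raw/2024", "orders.csv"))
# prints wasbs://sales@mystorageacct.dfs.core.windows.net/raw/2024/orders.csv
```

Omitting file_name produces a URL that addresses the directory, which matches the optional File name element described in the table.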

Authentication

Select an authentication type from the dropdown menu. The options available in the dropdown menu are the currently supported authentication types for this data source.

Field Description
Storage Account The Azure Storage account name.
Key The authentication key for the storage account.
Storage Account The Azure Storage account name.
Shared Access Signature The string value of the Azure Blob shared access signature.

Known limitations

  • A missing folder error can occur when you try to load files from Explorer over an Azure Blob connection.
    • A possible one-time workaround is to create the /opt/owl/previewcache folder before you load files from your Azure Blob connection.
  • Currently, the WASBS driver is not supported in Standalone Collibra DQ environments. However, support for WASBS is planned for an upcoming Collibra DQ release.
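
The one-time workaround for the missing folder error can be scripted on the host where the error occurs. The following is a minimal Python sketch: the function name ensure_preview_cache is hypothetical, the default path is the folder named in the workaround, and it assumes the script runs as a user with write access to /opt/owl.

```python
from pathlib import Path


def ensure_preview_cache(cache_dir: str = "/opt/owl/previewcache") -> Path:
    """Create the preview cache folder if it does not exist yet.

    Hypothetical helper for illustration; safe to run more than once.
    """
    path = Path(cache_dir)
    # parents=True creates missing parent folders; exist_ok=True makes
    # the call a no-op when the folder is already present.
    path.mkdir(parents=True, exist_ok=True)
    return path


if __name__ == "__main__":
    print(ensure_preview_cache())
```

Because the call is idempotent, it can be added to a provisioning script without first checking whether the folder exists.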