Connecting to Azure Blob Storage
This section contains an overview of Azure Blob Storage.
General information
Azure Blob Storage is an object store for unstructured data such as text, video, and audio. It uses flat object storage. The WASBS driver is used to connect to Azure Blob Storage.
| Field | Description |
|---|---|
| Data source | Azure Blob Storage |
| Supported versions | N/A |
| Connection string | wasbs:// |
| Packaged? | |
| Certified? | |
| Supported features | |
| Pushdown | |
| Estimate job | |
| Filtergram | |
| Analyze data | |
| Spark agent | |
| Yarn agent | |
Recommended and required connection properties
| Required | Connection Property | Type | Value |
|---|---|---|---|
| | Name | Text | The unique name used for your connection. |
| | Connection URL | String | The connection string value of your Azure Blob Storage connection. wasbs://<container>@<account name>.dfs.core.windows.net/<path>/<file name> Tip: See the Connection URL elements section for more information on the syntax. |
| | Target Agent | Text | The Agent used to submit your DQ Job. |
| | Auth Type | Option | The method to authenticate your connection. Note: The configuration requirements are different depending on the Auth Type you select. See Authentication for more details on available authentication types. |
| | Driver Properties | String | The configurable driver properties for your connection. |
Minimum user permissions
To create an Azure Blob Storage connection, you need the ROLE_ADMIN role in Collibra DQ and access to Azure Blob Storage.
Authentication
Select one of the following authentication options:
| Field | Description |
|---|---|
| Storage Account | The Azure Storage account name. |
| Key | The authentication key for the storage account. |
| Storage Account | The Azure Storage account name. |
| Shared Access Signature | The string value of the Azure Blob shared access signature. |
Connection URL elements
| Field | Description |
|---|---|
| File scheme | The wasbs protocol is used as the scheme identifier. |
| Container | The parent location that holds the files and folders. This is the same as the file system in the Azure Data Lake Storage service. |
| Account name | The name given to your storage account during creation. |
| Path | A forward-slash-delimited (/) representation of the directory structure. |
| File name | The name of the individual file. This parameter is optional when you address a directory. |
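
The elements in the table above can be combined into a Connection URL as in the following minimal Python sketch. The function names, the sample container, account, and path values are hypothetical and only illustrate the syntax. The host suffix defaults to the one shown in the Connection URL property on this page and is exposed as a parameter on the assumption that your storage account may use a different endpoint.

```python
from urllib.parse import urlsplit


def build_wasbs_url(container, account_name, path="", file_name="",
                    endpoint_suffix="dfs.core.windows.net"):
    """Assemble a wasbs:// URL from its elements (illustrative helper).

    endpoint_suffix is copied from the Connection URL shown on this page;
    treat it as an assumption and override it if your endpoint differs.
    """
    # Split the path on "/" and drop empty pieces from stray slashes.
    segments = [s for s in path.strip("/").split("/") if s]
    # The file name is optional when the URL addresses a directory.
    if file_name:
        segments.append(file_name)
    base = f"wasbs://{container}@{account_name}.{endpoint_suffix}"
    return base + "/" + "/".join(segments)


def parse_wasbs_url(url):
    """Split a wasbs:// URL back into scheme, container, account, and path."""
    parts = urlsplit(url)
    if parts.scheme != "wasbs":
        raise ValueError(f"expected the wasbs scheme, got {parts.scheme!r}")
    # The container sits before "@"; the account name is the first host label.
    account_name, _, endpoint_suffix = parts.hostname.partition(".")
    return {
        "scheme": parts.scheme,
        "container": parts.username,
        "account_name": account_name,
        "endpoint_suffix": endpoint_suffix,
        "path": parts.path,
    }


# Hypothetical values, for illustration only.
url = build_wasbs_url("mycontainer", "myaccount", "raw/2024", "orders.csv")
print(url)
print(parse_wasbs_url(url))
```

Omitting `file_name` yields a URL that addresses the directory itself, which matches the note that the file name is optional.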