Create an Azure Data Lake Storage connection to an Edge site
In Collibra 2024.05, we launched a new user interface (UI) for Collibra Data Intelligence Platform! You can learn more about this latest UI in the UI overview.
Prerequisites
- In Azure:
- To integrate ADLS folders, you need an Azure Service Principal user that is defined in Azure and that has permissions to list the files which need to be integrated into Collibra. The Azure Service Principal user must have the "Reader" and "Storage Blob Data Reader" roles for the storage locations of your data. For information, go to the Azure documentation.
- If you use Microsoft Purview:
- The Azure Service Principal user must have the "Data reader" role to fetch entities/assets from the Microsoft Purview REST API. For information, go to the Microsoft Purview documentation.
- If your ADLS storage is private, make sure that the "Allow Azure services on the trusted services list to access this storage account" checkbox under Networking → Firewalls and virtual networks is selected.
- You have created and installed an Edge site.
- If you have configured a forward proxy for your Edge site and want the integration API calls to bypass this proxy, update the Edge nonProxy property:
  - Adding login.microsoftonline.com allows the API calls that get access tokens to bypass the proxy.
  - Adding dfs.core.windows.net or blob.core.windows.net allows the ADLS API calls to bypass the proxy.
  - Adding purview.azure.com allows the Purview APIs to bypass the proxy.
- You have given the Edge Site role the required permissions.
- You have a global role that has the Manage connections and capabilities global permission, for example, Edge integration engineer.
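The proxy prerequisite above lists hosts by domain. As a rough illustration of which endpoints those entries are meant to cover, the following sketch checks the host of a URL against the listed domains. The suffix-matching rule and the `bypasses_proxy` helper are assumptions made for this example; they do not describe the actual syntax or matching behavior of the Edge nonProxy property. The storage account name `mystorageaccount` is hypothetical.

```python
from urllib.parse import urlparse

# Domains named in the prerequisites above.
NON_PROXY_DOMAINS = [
    "login.microsoftonline.com",  # access token requests
    "dfs.core.windows.net",       # ADLS API calls
    "blob.core.windows.net",      # ADLS API calls (blob endpoint)
    "purview.azure.com",          # Purview APIs
]


def bypasses_proxy(url, domains=NON_PROXY_DOMAINS):
    """Return True if the URL's host equals, or is a subdomain of, a listed domain.

    Suffix matching is an assumption made for this sketch, not the
    documented behavior of the Edge nonProxy property.
    """
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in domains)


if __name__ == "__main__":
    for url in (
        "https://login.microsoftonline.com/common/oauth2/v2.0/token",
        "https://mystorageaccount.dfs.core.windows.net/container",
        "https://example.com/",
    ):
        print(url, "->", "bypass" if bypasses_proxy(url) else "proxy")
```

A check like this can help you decide which entries you need before you edit the property, for example whether your storage accounts use the dfs endpoint, the blob endpoint, or both.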
Steps
- Open an Edge site.
  - On the main toolbar, click [icon], and then click Settings.
    The Collibra settings page opens.
  - In the tab pane, click Edge.
    The Sites tab opens and shows a table with an overview of the Edge sites.
  - In the table, click the name of the Edge site whose status is Healthy.
    The Edge site page opens.
- In the Connections section, click Create connection.
  The Create connection page appears.
- Enter the required information.
  Connection settings: this section contains the general settings of your connection.

  | Field | Description | Required |
  | --- | --- | --- |
  | Name | The name of the Edge connection for Azure Data Lake Storage. | Yes |
  | Description | The description of the connection. | No |
  | Connection provider | The connection provider, which determines the available connection parameters. Select the Azure connection to connect to Azure Data Lake Storage. | Yes |

  Connection parameters: this section contains the settings to connect to your data source.

  | Field | Description | Required |
  | --- | --- | --- |
  | Service Principal ID | The Application account ID to connect to Azure. For information on the Azure Service Principal user and the Application ID, go to the Azure documentation. | Yes |
  | Service Principal Secret | The application secret for the Service Principal. For information on the application secret value, go to the Azure documentation. | Yes |
  | Encryption options | Select the type of encryption used to store the Secret Access Key. The default is To be encrypted by Edge management server. | Yes |
  | Tenant ID | The Tenant ID of your Azure Active Directory. For information on the Directory (tenant) ID, go to the Azure documentation. | Yes |
- Click Create.
The connection is added to the Edge site.
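The three Azure connection parameters (Service Principal ID, Service Principal Secret, and Tenant ID) are the values used in an OAuth 2.0 client-credentials request against login.microsoftonline.com. The following sketch only builds such a request so that you can see where each value goes; it does not send anything, and it is not a description of how Edge authenticates internally. The `build_token_request` helper and all placeholder values are hypothetical, and the storage scope is an assumption about the resource you want to check.

```python
from urllib.parse import urlencode


def build_token_request(tenant_id, client_id, client_secret,
                        scope="https://storage.azure.com/.default"):
    """Build (url, form_body) for a client-credentials token request.

    tenant_id     -> the Tenant ID connection parameter
    client_id     -> the Service Principal ID (application ID)
    client_secret -> the Service Principal Secret
    scope         -> assumed resource scope; adjust to what you want to test
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    })
    return url, body


if __name__ == "__main__":
    url, body = build_token_request(
        tenant_id="00000000-0000-0000-0000-000000000000",   # placeholder
        client_id="11111111-1111-1111-1111-111111111111",   # placeholder
        client_secret="placeholder-secret",                 # placeholder
    )
    print(url)
    # Mask the secret before printing the form body.
    print(body.replace("placeholder-secret", "***"))
```

If you want to verify the values before you create the connection, you could send this body as a form-encoded POST from a host that can reach login.microsoftonline.com. Treat that as an optional sanity check, not as a required step.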
What's next?
You can now add the ADLS synchronization capability to an Edge site.
Available vaults
You can use a vault to add your data source information to your Edge site connection. The available options are:
- None
- AWS Secrets Manager
- Azure Key Vault
- CyberArk Vault
- Google Secret Manager
- HashiCorp Vault
Prerequisites
- In Azure:
- To integrate ADLS folders, you need an Azure Service Principal user that is defined in Azure and that has permissions to list the files which need to be integrated into Collibra. The Azure Service Principal user must have the "Reader" and "Storage Blob Data Reader" roles for the storage locations of your data. For information, go to the Azure documentation.
- If you use Microsoft Purview:
- The Azure Service Principal user must have the "Data reader" role to fetch entities/assets from the Microsoft Purview REST API. For information, go to the Microsoft Purview documentation.
- If your ADLS storage is private, make sure that the "Allow Azure services on the trusted services list to access this storage account" checkbox under Networking → Firewalls and virtual networks is selected.
- You have created and installed an Edge site.
- You have given the Edge Site role the required permissions.
- You have added a vault to your Edge site.
- If your data source connection requires a file from your vault, the file must be encoded into Base64 and stored as a regular secret in your vault.
- If you have configured a forward proxy for your Edge site and want the integration API calls to bypass this proxy, update the Edge nonProxy property:
  - Adding login.microsoftonline.com allows the API calls that get access tokens to bypass the proxy.
  - Adding dfs.core.windows.net or blob.core.windows.net allows the ADLS API calls to bypass the proxy.
  - Adding purview.azure.com allows the Purview APIs to bypass the proxy.
- You have a global role that has the Manage connections and capabilities global permission, for example, Edge integration engineer.
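The Base64 prerequisite above can be met with any Base64 encoder. The following is a minimal sketch in Python; the `encode_file_to_base64` helper is hypothetical, and the demo uses a throwaway temporary file in place of the file your connection needs. Storing the resulting string as a secret is done with your vault's own tooling and is not shown.

```python
import base64
import os
import tempfile
from pathlib import Path


def encode_file_to_base64(path):
    """Read a file as bytes and return its Base64 encoding as ASCII text."""
    data = Path(path).read_bytes()
    return base64.b64encode(data).decode("ascii")


if __name__ == "__main__":
    # Demo with a throwaway file; replace it with the file your connection needs.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
        tmp.write(b"example file content")
    try:
        encoded = encode_file_to_base64(tmp.name)
        print(encoded)
        # Round-trip check: decoding must give back the original bytes.
        assert base64.b64decode(encoded) == b"example file content"
    finally:
        os.unlink(tmp.name)
```

The output is a single line of text without line breaks, which is usually the easiest form to paste into a secret value.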
Steps
- Open an Edge site.
  - On the main toolbar, click [icon], and then click Settings.
    The Collibra settings page opens.
  - In the tab pane, click Edge.
    The Sites tab opens and shows a table with an overview of the Edge sites.
  - In the table, click the name of the Edge site whose status is Healthy.
    The Edge site page opens.
- In the Connections section, click Create connection.
  The Create connection page appears.
- Select the Azure connection to connect to Azure Data Lake Storage.
- Enter the required information.
  | Field | Description | Required |
  | --- | --- | --- |
  | Name | The name of the Edge connection for Azure Data Lake Storage. | Yes |
  | Description | The description of the connection. | No |
  | Vault | The vault where you store your data source values. | No |
  | Service Principal ID | The Application account ID to connect to Azure. For information on the Azure Service Principal user and the Application ID, go to the Azure documentation. You can provide this value from your vault. | Yes |
  | Service Principal Secret | The application secret for the Service Principal. For information on the application secret value, go to the Azure documentation. You can provide this value from your vault. | Yes |
  | Tenant ID | The Tenant ID of your Azure Active Directory. For information on the Directory (tenant) ID, go to the Azure documentation. You can provide this value from your vault. | Yes |

  How to use your vault
  To provide the Service Principal ID, Service Principal Secret, or Tenant ID from your vault, select Vault Key in the Value Type field of that parameter, and then enter the information that your vault requires:
  - If your vault identifies a secret by a query: enter the query value to identify the secret in your vault.
  - If your vault uses secret engines: enter the required information:

    | Name | Description |
    | --- | --- |
    | Secret Engine Type | Select one of the following: Key Value, Database. |
    | Engine Path | The engine path to your vault where the value is stored. |
    | Secret Path | The secret path to your vault where the value is stored. |
    | Field | The name of the field in your vault where the value is stored. Only available if you selected Key Value in the Secret Engine Type field. |
    | Role | The role specified in the Database engine. Only available if you selected Database in the Secret Engine Type field. |

  - If you use Azure Key Vault: enter the required information:

    | Name | Description |
    | --- | --- |
    | Vault Name | The name of your Azure Key Vault in your Azure Key Vault service where the value is stored. |
    | Secret Name | The name of the secret in your vault where the value is stored. |

  - If you use AWS Secrets Manager: enter the required information:

    | Name | Description |
    | --- | --- |
    | Secret Name | The name of the secret in your vault where the value is stored. |
    | Field | If the secret stored in your AWS Secrets Manager is a JSON value, for example {"pass1": "my-password", "pass2": "my-password2"}, specify the Field to point to the exact JSON value to use. For example, Secret Name: edge-db-customer; Field: pass1. If the secret is a plain string value, for example my-password, you do not need to specify the Field. |

  - If your vault identifies a secret by name only: enter the name of the secret in your vault where the value is stored.
- Click Create.
The connection is added to the Edge site.
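The AWS Secrets Manager Field setting in the table above distinguishes between JSON secrets and plain-string secrets. The following sketch illustrates that selection logic on local strings only. It does not call AWS, and the `resolve_secret_value` helper is hypothetical; it is not part of Collibra or of any AWS library.

```python
import json
from typing import Optional


def resolve_secret_value(secret_string: str, field: Optional[str] = None) -> str:
    """Pick the value to use from a secret string.

    - No field given: the secret is treated as a plain string and returned as is.
    - Field given: the secret must be a JSON object that contains that key.
    """
    if field is None:
        return secret_string
    parsed = json.loads(secret_string)
    if not isinstance(parsed, dict) or field not in parsed:
        raise KeyError(f"Field {field!r} not found in JSON secret")
    return str(parsed[field])


if __name__ == "__main__":
    json_secret = '{"pass1": "my-password", "pass2": "my-password2"}'
    # JSON secret: the field selects one of the stored values.
    print(resolve_secret_value(json_secret, field="pass1"))  # my-password
    # Plain string secret: no field is needed.
    print(resolve_secret_value("my-password"))               # my-password
```

In this sketch, a field name that does not match a key in the JSON secret raises an error, which is why the Field value should match one of the keys exactly.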
What's next?
You can now add the ADLS synchronization capability to an Edge site.