Create a Databricks connection to an Edge site
Before you begin
- You have created and installed an Edge site.
- You have given the Edge Site role the required permissions.
- Your Databricks access token has the following minimum permissions:
  - USE CATALOG permission for catalogs.
  - USE SCHEMA permission for schemas.
  - SELECT permission for tables and views belonging to the catalogs and schemas.
Note: A Databricks access token with the BROWSE permission can also integrate metadata from Databricks Unity Catalog. It doesn't require the USE CATALOG and USE SCHEMA permissions. This feature will be in Beta as long as it's in Public Preview for Databricks Unity Catalog.
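The minimum permissions listed above are typically granted in Databricks with SQL GRANT statements. The following is a minimal sketch that only builds those statements as strings. The function name and the principal, catalog, and schema names are hypothetical, and the sketch assumes that granting SELECT on a schema is inherited by the tables and views in it. Review the generated statements against the Databricks documentation before you run them.

```python
from __future__ import annotations


def unity_catalog_grants(principal: str, catalog: str, schema: str) -> list[str]:
    """Build GRANT statements for the minimum metadata permissions.

    All names passed in are illustrative; nothing is sent to Databricks.
    """

    def quote(identifier: str) -> str:
        # Backtick-quote an identifier and double any embedded backticks.
        return "`" + identifier.replace("`", "``") + "`"

    target = quote(principal)
    full_schema = f"{quote(catalog)}.{quote(schema)}"
    return [
        f"GRANT USE CATALOG ON CATALOG {quote(catalog)} TO {target};",
        f"GRANT USE SCHEMA ON SCHEMA {full_schema} TO {target};",
        # Assumption: SELECT on the schema is inherited by its tables and views.
        f"GRANT SELECT ON SCHEMA {full_schema} TO {target};",
    ]


if __name__ == "__main__":
    # Hypothetical service principal, catalog, and schema names.
    for statement in unity_catalog_grants("collibra-edge-sp", "main", "sales"):
        print(statement)
```

Generating the statements per catalog and schema keeps the grants limited to the data that you actually want to integrate.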
Required permissions
- You have a global role that has the Manage connections and capabilities global permission, for example, Edge integration engineer.
Steps
- Open an Edge site.
  - On the main toolbar, click [icon], and then click Settings.
    The Collibra settings page opens.
  - In the tab pane, click Edge.
    The Sites tab opens and shows a table with an overview of the Edge sites.
  - In the table, click the name of the Edge site whose status is Healthy.
    The Edge site page opens.
- In the Connections section, click Create connection.
  The Create connection page appears.
- Enter the required information.
Connection settings
This section contains the general settings of your connection.

| Field | Description | Required |
| --- | --- | --- |
| Name | The name of the Edge connection for Databricks. | Yes |
| Description | The description of the connection. | No |
| Connection provider | The connection provider, which determines the available connection parameters. Select Databricks to connect to Databricks. | Yes |

Connection parameters
This section contains the settings to connect to your data source.

| Field | Description | Required |
| --- | --- | --- |
| Workspace URL | Enter the full URL of any Databricks workspace connected to Unity Catalog that you want to integrate. To retrieve the full URL, log in to Databricks and copy the URL, including https://. For example: https://123.cloud.databricks.com. | Yes |
| Access Token | The security token that was generated in Databricks for the workspace. The access token must be a personal access token (PAT). It is possible to generate a PAT for service principals. For information on the service principal token, go to the Databricks documentation. | Yes |
| Encryption options | Select the type of encryption used to store the Secret Access Key. Default: To be encrypted by Edge management server. | Yes |

Note: The user or token must have the following minimum permissions:
- USE CATALOG permission for catalogs.
- USE SCHEMA permission for schemas.
- SELECT permission for tables and views belonging to the catalogs and schemas.
A Databricks access token with the BROWSE privilege can also integrate metadata from Databricks Unity Catalog. It doesn't require the USE CATALOG and USE SCHEMA privileges. This feature will be in Beta as long as it's in Public Preview for Databricks Unity Catalog.

- Click Create.
The connection is added to the Edge site.
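Before you enter the Workspace URL and Access Token, it can help to check them outside Collibra. The following is a minimal sketch, not part of the Collibra product. It normalizes a workspace URL to the https://host form shown in the example above and builds, without sending, a request to the Databricks Unity Catalog REST endpoint for listing catalogs. The function names are hypothetical, stripping the path and query from the URL is an assumption about what the field expects, and the token value is a placeholder.

```python
from urllib.parse import urlsplit
from urllib.request import Request


def normalize_workspace_url(raw: str) -> str:
    """Reduce a copied browser URL to the https://host form."""
    parts = urlsplit(raw.strip())
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError("Workspace URL must start with https:// and include a host")
    # Assumption: the path, query, and fragment are not part of the field value.
    return f"https://{parts.netloc}"


def build_catalog_list_request(workspace_url: str, token: str) -> Request:
    """Build (but do not send) a request that lists Unity Catalog catalogs."""
    base = normalize_workspace_url(workspace_url)
    return Request(
        f"{base}/api/2.1/unity-catalog/catalogs",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


if __name__ == "__main__":
    # Placeholder token; replace it with your own PAT before sending anything.
    request = build_catalog_list_request(
        "https://123.cloud.databricks.com/?o=42", "dapi-example-token"
    )
    print(request.full_url)
```

To send the request, you could pass it to urllib.request.urlopen. A successful response that lists the expected catalogs suggests that the token is accepted and can see those catalogs, although it does not confirm every permission that the integration needs.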
What's next?
You can now add the Databricks Unity Catalog capability to an Edge site.
Available vaults
You can use a vault to add your data source information to your Edge site connection. The following vault options are available:
- None
- AWS Secrets Manager
- Azure Key Vault
- CyberArk Vault
- Google Secret Manager
- HashiCorp Vault
Prerequisites
- You have created and installed an Edge site.
- You have given the Edge Site role the required permissions.
- You have added a vault to your Edge site.
- If your data source connection requires a file from your vault, the file must be encoded into Base64 and stored as a regular secret in your vault.
- If you want to integrate metadata, your Databricks access token or OAuth Client has the following minimum permissions:
  - USE CATALOG permission for catalogs.
  - USE SCHEMA permission for schemas.
  - SELECT permission for tables and views belonging to the catalogs and schemas.
  Note: A Databricks access token with the BROWSE permission can also integrate metadata from Databricks Unity Catalog. It doesn't require the USE CATALOG and USE SCHEMA permissions. This feature will be in Beta as long as it's in Public Preview for Databricks Unity Catalog.
- If you want to integrate Databricks AI models, ensure that your Databricks access token or OAuth Client also has the following permissions:
  - EXECUTE permission on the registered model.
  - USE CATALOG permission on the parent catalog.
  - USE SCHEMA permission on the parent schema.
- You have a global role that has the Manage connections and capabilities global permission, for example, Edge integration engineer.
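The prerequisite about files in your vault requires the file content to be Base64 encoded before you store it as a regular secret. The following is a minimal sketch of that encoding step using the Python standard library. The function names and the file path in the usage example are hypothetical.

```python
import base64
from pathlib import Path


def encode_bytes_for_vault(data: bytes) -> str:
    """Return the Base64 text form of raw bytes."""
    return base64.b64encode(data).decode("ascii")


def encode_file_for_vault(path: str) -> str:
    """Read a file and return its content as Base64 text."""
    return encode_bytes_for_vault(Path(path).read_bytes())


if __name__ == "__main__":
    # Hypothetical path; point this at the file your connection requires.
    print(encode_file_for_vault("driver-truststore.jks"))
```

You would then store the printed string as the secret value in your vault.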
Steps
- Open an Edge site.
  - On the main toolbar, click [icon], and then click Settings.
    The Collibra settings page opens.
  - In the tab pane, click Edge.
    The Sites tab opens and shows a table with an overview of the Edge sites.
  - In the table, click the name of the Edge site whose status is Healthy.
    The Edge site page opens.
- In the Connections section, click Create connection.
- Select Databricks to connect to Databricks.
  The Create connection page appears.
- Enter the required information.
| Field | Description | Required |
| --- | --- | --- |
| Name | The name of the Edge connection for Databricks. | Yes |
| Description | The description of the connection. | No |
| Vault | The vault where you store your data source values. | No |
| Workspace URL | Enter the URL of any Databricks workspace connected to Unity Catalog that you want to integrate. To retrieve the URL, log in to Databricks and copy the URL. For example: https://123.cloud.databricks.com. To take this value from your vault, see "How to use your vault" after this table. | Yes |
| Authentication Type | Select the type of authentication that you want to apply. You can select Personal Access Token or OAuth. For information on OAuth-based authentication in Databricks Unity Catalog, go to the Databricks documentation. | Yes |
| Access Token | The security token that was generated in Databricks for the workspace. The access token must be a personal access token (PAT). It is possible to generate a PAT for service principals. For information on the service principal token, go to the Databricks documentation. To take this value from your vault, see "How to use your vault" after this table. | Yes, if you select Personal Access Token as the authentication type. |
| Client ID | The client ID for the OAuth-based authentication on Databricks. For information on OAuth-based authentication in Databricks Unity Catalog, go to the Databricks documentation. | Yes, if you select OAuth as the authentication type. |
| Client Secret | The client secret generated for the OAuth-based authentication on Databricks. | Yes, if you select OAuth as the authentication type. |

Note: The following permissions apply to the user or token behind the Access Token and to the OAuth Client behind the Client ID:
- If you want to integrate metadata, the following minimum permissions are required:
  - USE CATALOG permission for catalogs.
  - USE SCHEMA permission for schemas.
  - SELECT permission for tables and views belonging to the catalogs and schemas.
- If you want to integrate Databricks AI models, the following permissions are also required:
  - EXECUTE permission on the registered model.
  - USE CATALOG permission on the parent catalog.
  - USE SCHEMA permission on the parent schema.
A Databricks access token with the BROWSE permission can also integrate metadata from Databricks Unity Catalog. It doesn't require the USE CATALOG and USE SCHEMA permissions. This feature will be in Beta as long as it's in Public Preview for Databricks Unity Catalog.

How to use your vault
To take the Workspace URL or Access Token value from your vault, select Vault Key in the Value Type field, and then enter the information for your vault type:
- CyberArk Vault: Enter the query value to identify the secret in your vault.
- HashiCorp Vault: Enter the following information:
  - Secret Engine Type: Select Key Value or Database.
  - Engine Path: The engine path to your vault where the value is stored.
  - Secret Path: The secret path to your vault where the value is stored.
  - Field: The name of the field in your vault where the value is stored. Only available if you selected Key Value in the Secret Engine Type field.
  - Role: The role specified in the Database engine. Only available if you selected Database in the Secret Engine Type field.
- Azure Key Vault: Enter the following information:
  - Vault Name: The name of your Azure Key Vault in your Azure Key Vault service where the value is stored.
  - Secret Name: The name of the secret in your vault where the value is stored.
- AWS Secrets Manager: Enter the following information:
  - Secret Name: The name of the secret in your vault where the value is stored.
  - Field: If the secret stored in your AWS Secrets Manager is a JSON value, for example {"pass1": "my-password", "pass2": "my-password2"}, specify the Field to point to the exact JSON value that should be used. For example, Secret Name: edge-db-customer; Field: pass1. If the secret is a plain string value, for example my-password, you do not need to specify the Field.
- Google Secret Manager: Enter the name of the secret in your vault where the value is stored.

- Click Create.
The connection is added to the Edge site.
Note Collibra validates the credentials when synchronizing Databricks Unity Catalog.
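The Field setting described for AWS Secrets Manager in the field table above selects one value from a JSON secret, while a plain string secret is used as is. The following is a minimal sketch of that selection logic, written to illustrate the described behavior. It is not Collibra's implementation, and the function name is hypothetical.

```python
import json
from typing import Optional


def resolve_secret_value(secret_string: str, field: Optional[str] = None) -> str:
    """Pick the value a connection would use from a stored secret.

    Without a field, the secret is treated as a plain string.
    With a field, the secret must be a JSON object that contains that key.
    """
    if field is None:
        return secret_string
    data = json.loads(secret_string)
    if not isinstance(data, dict) or field not in data:
        raise KeyError(f"Field {field!r} not found in the JSON secret")
    return str(data[field])


if __name__ == "__main__":
    json_secret = '{"pass1": "my-password", "pass2": "my-password2"}'
    print(resolve_secret_value(json_secret, "pass1"))
    print(resolve_secret_value("my-password"))
```

The sketch raises an error when the Field does not match a key in the JSON secret, which mirrors why the Field must point to the exact JSON key.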
What's next?
You can now add the Databricks Unity Catalog capability to an Edge site.