Create a Databricks connection to an Edge or Collibra Cloud site

Prerequisites

In your Collibra environment

In your Databricks environment

  • Your Databricks access token must have the BROWSE permission on the catalogs in Databricks Unity Catalog from which you want to integrate metadata. For more information on the BROWSE permission, go to the Databricks documentation.
  • If you want to integrate source tags, additional permissions are needed.
    • The metadata synchronization for Databricks Unity Catalog uses compute clusters (SQL query compute warehouse) to collect source tags. To allow this, grant the following permissions:
      • CAN ATTACH TO
      • CAN RESTART
      During the synchronization configuration, you can specify that the compute clusters stop after the source tags are extracted.
    • To integrate source tags from specific tables in system.information_schema, grant the following permissions:
      • USE CATALOG permission on system catalog
      • USE SCHEMA permission on system.information_schema
      • SELECT permission on the following:
        • system.information_schema.catalog_tags
        • system.information_schema.schema_tags
        • system.information_schema.table_tags
        • system.information_schema.column_tags
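
The source tag permissions above can be turned into SQL statements for a Databricks administrator to review. The following is a minimal sketch, not an official Collibra or Databricks script: it assumes the standard Databricks SQL GRANT syntax, and the function name and the principal `collibra-integration@example.com` are hypothetical placeholders. Check the generated statements against the Databricks documentation before running them.

```python
# Tag tables in system.information_schema named in the prerequisites.
TAG_TABLES = ["catalog_tags", "schema_tags", "table_tags", "column_tags"]


def source_tag_grants(principal: str) -> list[str]:
    """Build GRANT statements for the source tag permissions.

    `principal` is the user or service principal that owns the access
    token (an assumption; use whatever identity your token belongs to).
    """
    target = f"`{principal}`"
    statements = [
        f"GRANT USE CATALOG ON CATALOG system TO {target};",
        f"GRANT USE SCHEMA ON SCHEMA system.information_schema TO {target};",
    ]
    # One SELECT grant per tag table.
    statements += [
        f"GRANT SELECT ON TABLE system.information_schema.{table} TO {target};"
        for table in TAG_TABLES
    ]
    return statements


if __name__ == "__main__":
    # Hypothetical principal, for illustration only.
    for statement in source_tag_grants("collibra-integration@example.com"):
        print(statement)
```

The sketch only prints the statements; it does not connect to Databricks. The CAN ATTACH TO and CAN RESTART permissions are set on the compute resource itself and are not covered here.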

Steps

  1. Open a site.
    1. On the main toolbar, click the Products icon, and then click Settings (cogwheel icon).
      The Settings page opens.
    2. In the tab pane, click Edge.
      The Sites tab opens and shows a table with an overview of your sites.
    3. In the table, click the name of the site whose status is Healthy.
      The site page opens.
  2. In the Connections section, click Create connection.
    The Create connection page appears.
  3. Enter the required information.
    Connection settings
    This section contains the general settings of your connection.

    Name
      The name of the Edge connection for Databricks.
      Required: Yes

    Description
      The description of the connection.
      Required: No

    Connection provider
      The connection provider, which determines the available connection parameters.
      Select Databricks to connect to Databricks.
      Required: Yes

    Connection parameters
    This section contains the settings to connect to your data source.

    Workspace URL
      Enter the full URL of any Databricks workspace connected to Unity Catalog that you want to integrate.
      To retrieve the full URL, log into Databricks and copy the URL, including https://. For example: https://123.cloud.databricks.com.
      Required: Yes

    Access Token
      The security token that was generated in Databricks for the workspace.
      The access token must be a personal access token (PAT).
      It is possible to generate a PAT for service principals. For information on the service principal token, go to the Databricks documentation.
      Note Ensure that your Databricks access token has been granted the required permissions in your Databricks environment.
      Required: Yes

    Encryption options
      Select the type of encryption used to store the Secret Access Key.
      Default: To be encrypted by Edge management server.
      Required: Yes
  4. Click Create.
    The connection is added to the Edge or Collibra Cloud site.
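
Before you enter the workspace URL and access token, it can help to confirm that they work together. The following is a minimal sketch under stated assumptions: it builds (but does not send) a request to the Databricks REST endpoint that lists Unity Catalog catalogs, using the personal access token as a bearer token. The function name is hypothetical, and this is not how Collibra itself validates the connection.

```python
import urllib.request


def build_catalog_list_request(workspace_url: str, access_token: str) -> urllib.request.Request:
    """Build a GET request that lists Unity Catalog catalogs.

    The workspace URL must be the full URL, including https://.
    """
    if not workspace_url.startswith("https://"):
        raise ValueError("Workspace URL must include https://")
    base = workspace_url.rstrip("/")
    return urllib.request.Request(
        f"{base}/api/2.1/unity-catalog/catalogs",
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


if __name__ == "__main__":
    # Placeholder values taken from the example URL above; the token is fake.
    request = build_catalog_list_request("https://123.cloud.databricks.com", "<your-PAT>")
    print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen`. A response that lists the catalogs you expect suggests that the token can see them; an authorization error suggests that the token or its permissions need attention.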

What's next

You can now add the Databricks Unity Catalog capability to an Edge or Collibra Cloud site.

Available vaults

Tip 

You can use a vault to add your data source information to your Edge site connection.

Vaults are not available for Collibra Cloud sites.

  • None
  • AWS Secrets Manager
  • Azure Key Vault
  • CyberArk Vault
  • Google Secret Manager
  • HashiCorp Vault

Prerequisites

In your Collibra environment

  • You either created and installed an Edge site or were granted a Collibra Cloud site.
  • You have added a vault to your Edge site.
    Note Vaults are not supported on Collibra Cloud sites.
  • If your data source connection requires a file from your vault, the file must be encoded into Base64 and stored as a regular secret in your vault.
  • You have a global role that has the Manage connections and capabilities global permission, for example, Edge integration engineer.
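
The Base64 requirement for vault-stored files can be met with a few lines of code. The following is a minimal sketch using only the Python standard library; the function names and the file name `truststore.jks` are hypothetical, and how you store the resulting string as a secret depends on your vault.

```python
import base64
from pathlib import Path


def encode_bytes_for_vault(data: bytes) -> str:
    """Return the Base64 text form of raw bytes."""
    return base64.b64encode(data).decode("ascii")


def encode_file_for_vault(path: str) -> str:
    """Read a file and return its content as a Base64 string."""
    return encode_bytes_for_vault(Path(path).read_bytes())


if __name__ == "__main__":
    import sys

    # Usage: python encode_for_vault.py truststore.jks
    # Prints the Base64 string to paste into the vault as a regular secret.
    if len(sys.argv) == 2:
        print(encode_file_for_vault(sys.argv[1]))
```

The output is a single line of ASCII text, which is why it can be stored as a regular secret rather than as a file.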

In your Databricks environment

  • Your Databricks access token or OAuth client must have the BROWSE permission on the catalogs in Databricks Unity Catalog from which you want to integrate metadata. For more information on the BROWSE permission, go to the Databricks documentation.

  • If you want to integrate source tags, additional permissions are needed.
    • The metadata synchronization for Databricks Unity Catalog uses compute clusters (SQL query compute warehouse) to collect source tags. To allow this, grant the following permissions:
      • CAN ATTACH TO
      • CAN RESTART
      During the synchronization configuration, you can specify that the compute clusters stop after the source tags are extracted.
    • To integrate source tags from specific tables in system.information_schema, grant the following permissions:
      • USE CATALOG permission on system catalog
      • USE SCHEMA permission on system.information_schema
      • SELECT permission on the following:
        • system.information_schema.catalog_tags
        • system.information_schema.schema_tags
        • system.information_schema.table_tags
        • system.information_schema.column_tags
  • If you want to integrate Databricks AI models, ensure that your Databricks access token or OAuth client also has the following permissions:

    • EXECUTE permission on the registered model.
    • USE CATALOG permission on the parent catalog.
    • USE SCHEMA permission on the parent schema.

Steps

  1. Open a site.
    1. On the main toolbar, click the Products icon, and then click Settings (cogwheel icon).
      The Settings page opens.
    2. In the tab pane, click Edge.
      The Sites tab opens and shows a table with an overview of your sites.
    3. In the table, click the name of the site whose status is Healthy.
      The site page opens.
  2. In the Connections section, click Create connection.
  3. Select Databricks to connect to Databricks.
    The Create connection page appears.
  4. Enter the required information.
    Name
      The name of the Edge or Collibra Cloud site connection for Databricks.
      Required: Yes

    Description
      The description of the connection.
      Required: No

    Vault
      The vault where you store your data source values.
      Required: No

    Workspace URL
      Enter the URL of any Databricks workspace connected to Unity Catalog that you want to integrate.
      To retrieve the URL, log into Databricks and copy the URL. For example: https://123.cloud.databricks.com.
      Required: Yes

    Authentication Type
      Select the type of authentication that you want to apply. You can select any of the following values:
        • Personal Access Token
        • OAuth
        • Microsoft Entra ID
      Required: Yes

    Access Token
      The security token that was generated in Databricks for the workspace. The access token must be a personal access token (PAT).
      It is possible to generate a PAT for service principals. For information on the service principal token, go to the Databricks documentation.
      Note Ensure that your Databricks access token has been granted the required permissions in your Databricks environment.
      Required: Yes, if you select Personal Access Token as the authentication type.

    Client ID
      The client ID for OAuth-based authentication in Databricks, or the client ID of the Microsoft Entra ID service principal.
      For information on OAuth-based authentication in Databricks Unity Catalog, go to the Databricks documentation.
      For information on the Microsoft Entra ID service principal, go to Microsoft Entra service principal authentication in the Azure Databricks documentation.
      Note Ensure that your Databricks OAuth client or Microsoft Entra ID service principal has been granted the required permissions in your Databricks environment.
      Required: Yes, if you select OAuth or Microsoft Entra ID as the authentication type.

    Client Secret
      The client secret generated for the OAuth-based authentication on Databricks, or the client secret of the Microsoft Entra ID service principal.
      Required: Yes, if you select OAuth or Microsoft Entra ID as the authentication type.

    Tenant ID
      The Directory (tenant) ID for the related application registered in Microsoft Entra ID.
      For information, go to Microsoft Entra service principal authentication in the Azure Databricks documentation.
      Required: Yes, if you select Microsoft Entra ID as the authentication type.
  5. Click Create.
    The connection is added to the Edge or Collibra Cloud site.

Note Collibra validates the credentials when synchronizing Databricks Unity Catalog.
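
Because the credentials are only validated at synchronization time, it can be useful to check up front that every field required by the chosen authentication type is filled in. The following is a minimal sketch of those "required if" rules from the field descriptions above. The dictionary keys (`name`, `workspace_url`, and so on) and the function name are hypothetical and do not correspond to a Collibra API.

```python
# Fields that are always required, regardless of authentication type.
COMMON_FIELDS = ["name", "workspace_url", "authentication_type"]

# Extra fields required per authentication type, per the field descriptions.
FIELDS_BY_AUTH_TYPE = {
    "Personal Access Token": ["access_token"],
    "OAuth": ["client_id", "client_secret"],
    "Microsoft Entra ID": ["client_id", "client_secret", "tenant_id"],
}


def missing_fields(connection: dict) -> list[str]:
    """Return the required fields that are absent or empty."""
    auth_type = connection.get("authentication_type")
    if auth_type and auth_type not in FIELDS_BY_AUTH_TYPE:
        raise ValueError(f"Unknown authentication type: {auth_type}")
    required = COMMON_FIELDS + FIELDS_BY_AUTH_TYPE.get(auth_type, [])
    return [field for field in required if not connection.get(field)]


if __name__ == "__main__":
    # Illustrative values only.
    draft = {
        "name": "databricks-prod",
        "workspace_url": "https://123.cloud.databricks.com",
        "authentication_type": "Microsoft Entra ID",
        "client_id": "example-client-id",
    }
    print(missing_fields(draft))
```

A check like this catches missing values only. Whether the credentials are accepted and have the required permissions is still determined when you synchronize Databricks Unity Catalog.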

What's next

If you want to enable sampling, profiling, and classification, create a Databricks JDBC connection. You can then select that JDBC connection when you configure the synchronization.

You can then add the Databricks Unity Catalog capability to an Edge or Collibra Cloud site.