Set up Protect

Important 

In Collibra 2024.05, we launched a new user interface (UI) for Collibra Data Intelligence Platform! You can learn more about this latest UI in the UI overview.

This documentation describes how to set up Protect and establish a connection between your data source and Protect.

Tip 

The setup differs for each data source. Follow the instructions below that match your data source.

Before you begin

AWS Lake Formation

  1. Download the JDBC driver for Amazon Athena.
  2. Create a JDBC connection from your Edge site to Amazon Athena.
    Tip When creating the connection, in the Connection provider field, select Generic JDBC connection. In the Connection properties section, set the IncludeTableTypes connection property to true. With this property, the ingested metadata distinguishes tables from views, so Collibra creates both Table assets and View assets. If the property is set to false, all metadata is ingested as Table assets.
  3. Add the Catalog JDBC ingestion capability to the Edge site.
    Tip 

    When adding the capability:

    • In the Capability template field, select Catalog JDBC Ingestion.
    • In the JDBC Connection field, select the connection that you created in Step 2.
  4. Register and synchronize the data source.
Tip The following image shows an ingested AWS Lake Formation database. The Data Source Type attribute containing the value Amazon Athena is added to the database asset only after the Catalog JDBC ingestion process is complete.

Image of an ingested Athena database
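
The effect of the IncludeTableTypes property described in Step 2 can be pictured with a small sketch. This is an illustration only, not Collibra's ingestion code: the function name `asset_type_for` and the `"VIEW"` type string are assumptions made for the example.

```python
def asset_type_for(jdbc_table_type: str, include_table_types: bool) -> str:
    """Return the asset type a table-like object would be ingested as.

    Hypothetical helper that models the behavior described in the tip,
    not the actual Catalog JDBC ingestion logic.
    """
    # Without IncludeTableTypes=true, tables and views are not distinguished.
    if not include_table_types:
        return "Table"
    # With the property enabled, views become View assets.
    return "View" if jdbc_table_type.strip().upper() == "VIEW" else "Table"


if __name__ == "__main__":
    for flag in (True, False):
        print(flag, asset_type_for("TABLE", flag), asset_type_for("VIEW", flag))
```

With the property set to true, a view maps to a View asset. With it set to false, the same view maps to a Table asset.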

BigQuery

  1. Download the JDBC driver for Google BigQuery.
  2. Create a JDBC connection from your Edge site to Google BigQuery.
    Tip When creating the connection, in the Connection provider field, select Generic JDBC connection. In the Connection properties section, set the value of the Other connection property to SupportNativeDataType=True.
  3. Add the Catalog JDBC ingestion capability to the Edge site.
    Tip 

    When adding the capability:

    • In the Capability template field, select Catalog JDBC Ingestion.
    • In the JDBC Connection field, select the connection that you created in Step 2.
  4. Register and synchronize the data source.
Tip The following image shows an ingested BigQuery database. The Data Source Type attribute containing the value Google BigQuery is added to the database asset only after the Catalog JDBC ingestion process is complete.

Ingested BigQuery database

Databricks

  1. Ensure that the following setting is enabled by Collibra: feature.protect.databricks
    Tip This can be done by adding the following JVM parameter via Collibra Console and then restarting the service: -Dfeature.protect.databricks=true
  2. Download the JDBC driver for Databricks.
  3. Create a JDBC connection from your Edge site to Databricks.
    Tip When creating the connection, in the Connection provider field, select Username/Password JDBC connection. In the Connection string field, include EnableArrow=0.
  4. Add the Catalog JDBC ingestion capability to the Edge site.
    Tip 

    When adding the capability:

    • In the Capability template field, select Catalog JDBC Ingestion.
    • In the JDBC Connection field, select the connection that you created in Step 3.
  5. Register and synchronize the data source.
Tip The following image shows an ingested Databricks database. The Data Source Type attribute containing the value SparkSQL is added to the database asset only after the Catalog JDBC ingestion process is complete.

Ingested Databricks database
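
Step 3 asks for EnableArrow=0 in the connection string. As a minimal sketch, assuming the connection string consists of a prefix followed by semicolon-separated key=value properties, the following shows one way to add the setting without duplicating it. The helper `ensure_property` is hypothetical, and the host and httpPath values are placeholders, not real endpoints.

```python
def ensure_property(conn_str: str, key: str, value: str) -> str:
    """Return conn_str with key=value present exactly once.

    Assumes a JDBC-style string of the form '<prefix>;k1=v1;k2=v2'.
    Hypothetical helper for illustration only.
    """
    head, *props = conn_str.split(";")
    # Drop empty segments and any existing setting for the same key.
    kept = [p for p in props if p and p.split("=", 1)[0].lower() != key.lower()]
    kept.append(f"{key}={value}")
    return ";".join([head] + kept)


if __name__ == "__main__":
    # Placeholder connection string; replace with your own values.
    base = "jdbc:databricks://example-host:443;httpPath=/placeholder"
    print(ensure_property(base, "EnableArrow", "0"))
```

Calling the helper on a string that already contains EnableArrow=0 returns the string unchanged, so it is safe to apply repeatedly.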

Snowflake

  1. Download the JDBC driver for Snowflake.
  2. Create a JDBC connection from your Edge site to Snowflake.
    Tip When creating the connection, in the Connection provider field, select Username/Password JDBC connection.
  3. Add the Catalog JDBC ingestion capability to the Edge site.
    Tip 

    When adding the capability:

    • In the Capability template field, select Catalog JDBC Ingestion.
    • In the JDBC Connection field, select the connection that you created in Step 2.
  4. Register and synchronize the data source.
Tip The following image shows an ingested Snowflake database. The Data Source Type attribute containing the value Snowflake is added to the database asset only after the Catalog JDBC ingestion process is complete.

Ingested Snowflake database

Steps

AWS Lake Formation

  1. Contact Collibra Support or your representative to enable Protect on your Collibra environment.
  2. Ensure that the Protect global roles and global permissions are correctly set.

    Image of the Protect global roles

  3. Create an AWS connection from the Edge site to Amazon Athena.
    Tip 
    • When creating the connection, in the Connection provider field, select AWS connection.
    • Ensure that the user associated with the Access Key ID used in the connection has the required permissions.
  4. Add the Protect for AWS Lake Formation capability to the Edge site.
    Tip 
    • When adding the capability:
      • In the Capability template field, select Collibra Protect for AWS Lake Formation.
      • In the Connection field, select the connection that you created in Step 3.
    • Do not add more than one Protect for AWS Lake Formation capability to the Edge site.
  5. Protect is set up. You can now open Protect from the main toolbar.

BigQuery

Note In addition to the JDBC connection created for the Catalog ingestion, Protect for BigQuery requires a GCP connection. Protect needs access to certain GCP APIs that cannot be reached through the JDBC connection alone. The GCP connection provides that access so that data protection can be enforced.

  1. Contact Collibra Support or your representative to enable Protect on your Collibra environment.
  2. Ensure that the Protect global roles and global permissions are correctly set.

    Image of the Protect global roles

  3. Create a GCP connection from the Edge site to Google BigQuery.
    Tip 
    • When creating the connection, in the Connection provider field, select GCP connection.
    • Ensure that the user associated with the GCP Service Account used in the connection has the required permissions.
  4. Add the Protect for BigQuery capability to the Edge site.
    Tip 
    • When adding the capability:
      • In the Capability template field, select Collibra Protect for Google BigQuery.
      • In the Connection field, select the connection that you created in Step 3.
    • Do not add more than one Protect for BigQuery capability to the Edge site.
    • If the version of the capability is 1.97.1, then ensure that the JSON content in the GCP Service Account field in the GCP connection you created is Base64 encoded. You can find the version of the capability in the Version column on the Capabilities tab.
  5. Protect is set up. You can now open Protect from the main toolbar.
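
For capability version 1.97.1, Step 4 requires the GCP Service Account JSON to be Base64 encoded. The following is a minimal sketch of one way to produce that value with the Python standard library. The function name `encode_service_account` and the sample JSON are assumptions for illustration; a real key file contains more fields and must be kept secret.

```python
import base64
import json


def encode_service_account(json_text: str) -> str:
    """Return the Base64 encoding of a service account JSON document."""
    # Parse first so that malformed JSON fails here instead of in the connection.
    json.loads(json_text)
    return base64.b64encode(json_text.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    # Placeholder content, not a real service account key.
    sample = '{"type": "service_account", "project_id": "example-project"}'
    print(encode_service_account(sample))
```

The printed string is the value to paste into the GCP Service Account field. Decoding it with `base64.b64decode` returns the original JSON, which is a quick way to check the result.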

Databricks

  1. Contact Collibra Support or your representative to enable Protect on your Collibra environment.
  2. Ensure that the Protect global roles and global permissions are correctly set.

    Image of the Protect global roles

  3. Create a Username/Password JDBC connection from the Edge site to Databricks.
    Tip 
    • When creating the connection, in the Connection provider field, select Username/Password JDBC connection.
    • Ensure that the user associated with the Databricks role used in the connection has the required privileges.
  4. Add the Protect for Databricks capability to the Edge site.
    Tip 
    • When adding the capability:
      • In the Capability template field, select Collibra Protect for Databricks.
      • In the Connection field, select the connection that you created in Step 3.
    • Do not add more than one Protect for Databricks capability to the Edge site.
  5. Protect is set up. You can now open Protect from the main toolbar.

Snowflake

  1. Contact Collibra Support or your representative to enable Protect on your Collibra environment.
  2. Ensure that the Protect global roles and global permissions are correctly set.

    Image of the Protect global roles

  3. Create a Username/Password JDBC connection from the Edge site to Snowflake.
    Tip 
    • When creating the connection, in the Connection provider field, select Username/Password JDBC connection.
    • Ensure that the user associated with the Snowflake role used in the connection has the required privileges.
  4. Add the Protect for Snowflake capability to the Edge site.
    Tip 
    • When adding the capability:
      • In the Capability template field, select Collibra Protect for Snowflake.
      • In the Connection field, select the connection that you created in Step 3.
    • Do not add more than one Protect for Snowflake capability to the Edge site.
  5. Protect is set up. You can now open Protect from the main toolbar.

What's next?