Steps: Integrate Databricks Unity Catalog via Edge
The steps differ depending on whether you want to profile and classify the column data after you integrate Databricks Unity Catalog.
Steps to integrate metadata and allow for sampling, profiling, and classification (beta)
# | Step | Description |
---|---|---|
1 | Give the Edge Site user the required permissions. | Ensures the Edge Site user can integrate the metadata. |
2 | Create the required connections. | |
2a | Create a Databricks connection to your Edge site. | Creates a connection to Databricks in an Edge site, which is used during metadata synchronization. |
2b | Create a Databricks JDBC connection to your Edge site. | Creates a JDBC connection to Databricks in an Edge site, which is used during profiling and classification. |
3 | Add the required capabilities. | |
3a | Add the Databricks Unity Catalog synchronization capability to the Edge site. | Adds the Databricks Unity Catalog capability to the Edge connections you created for Databricks. |
3b | Add the JDBC Catalog Ingestion capability to the Edge site. | Adds the JDBC Catalog Ingestion capability to the Databricks JDBC connection. The capability retrieves the available databases and schemas in Databricks Unity Catalog during profiling and classification. |
4 | Synchronize Databricks Unity Catalog. | You can manually synchronize Databricks Unity Catalog or add a synchronization schedule. Once the synchronization is complete, the metadata is integrated. |
5 | Set up and configure data profiling. | Covers the required permissions and steps to prepare Edge and Collibra to profile columns in Databricks Unity Catalog. |
6 | Enable and set up Unified Data Classification. | Covers the required permissions and steps to prepare Edge and Collibra to classify columns in Databricks Unity Catalog via the Unified Data Classification method. |
7 | Set up and configure the use of sample data. | Covers the required permissions and steps to prepare Edge and Collibra to show sample data for columns in Databricks Unity Catalog. |
Result | Users with the correct permissions can now re-synchronize the metadata, configure the profiling options and profile the data, automatically classify the data, or request sample data. | |
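The Databricks JDBC connection relies on a JDBC connection string. As an illustration only, the following Python sketch assembles a URL in the general format documented for the Databricks JDBC driver with personal access token authentication. The function name `build_databricks_jdbc_url` and the sample host and HTTP path are hypothetical, and the exact properties that your Edge JDBC connection expects may differ, so treat the output as a starting point rather than a verified Collibra setting.

```python
def build_databricks_jdbc_url(host: str, http_path: str, port: int = 443) -> str:
    """Return a Databricks JDBC URL for personal access token authentication.

    Assumption: AuthMech=3 with UID=token is the token-based scheme of the
    Databricks JDBC driver. The token itself (the PWD property) is left out
    on purpose so that it can be supplied as a separate, protected value.
    """
    if not host or "/" in host:
        raise ValueError("host must be a bare hostname, without scheme or path")
    return (
        f"jdbc:databricks://{host}:{port};"
        f"httpPath={http_path};"
        "AuthMech=3;UID=token"
    )


if __name__ == "__main__":
    # Hypothetical workspace host and SQL warehouse path, for illustration.
    print(build_databricks_jdbc_url(
        "example.cloud.databricks.com", "/sql/1.0/warehouses/abc123"
    ))
```

Keeping the token out of the URL is a design choice in this sketch: it avoids the secret ending up in logs or screenshots of the connection string.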
Steps to only integrate the metadata
# | Step | Description |
---|---|---|
1 | Give the Edge Site user the required permissions. | Ensures the Edge Site user can integrate the metadata. |
2 | Create a Databricks connection to your Edge site. | Creates a connection to Databricks in an Edge site. |
3 | Add the Databricks Unity Catalog capability to the Edge site. | Adds the Databricks Unity Catalog capability to the Edge connection. The capability allows you to retrieve data from Databricks Unity Catalog. |
4 | Synchronize Databricks Unity Catalog. | You can manually synchronize Databricks Unity Catalog or add a synchronization schedule. Once the synchronization is completed, the metadata is integrated. |
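Before you synchronize, it can help to confirm which catalogs the integration user can actually see. This check is not part of the Collibra procedure. It is a minimal Python sketch that assumes the Databricks REST endpoint `/api/2.1/unity-catalog/catalogs` and a personal access token. The helper names `catalogs_request` and `catalog_names` are hypothetical.

```python
import json
import urllib.request


def catalogs_request(workspace_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that lists Unity Catalog catalogs."""
    url = workspace_url.rstrip("/") + "/api/2.1/unity-catalog/catalogs"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def catalog_names(response_body: str) -> list[str]:
    """Extract the catalog names from the JSON body of the list response.

    Assumption: the body has the shape {"catalogs": [{"name": ...}, ...]}.
    A body without a "catalogs" key yields an empty list.
    """
    payload = json.loads(response_body)
    return [catalog["name"] for catalog in payload.get("catalogs", [])]
```

To run the check against a real workspace, pass the request to `urllib.request.urlopen`, decode the response body, and hand it to `catalog_names`. If a catalog you expect to integrate is missing from the result, revisit the permissions in step 1 first.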