Integrated Databricks Unity Catalog data
After the synchronization, the resulting Database, Schema, Table, and Column assets are available in the domain where the provided System asset is located.
If you move the resulting Table and Column assets to another domain and you run the integration again, the Table and Column assets will be moved back to their initial domain. However, if you move the resulting Database or Schema asset to another domain, the Database asset will remain in the new domain.
To move all resulting assets to another location permanently, select another System asset in the current synchronization configuration or create a new capability with a synchronization configuration that integrates the data in the new location.
Example: You created System asset A in Domain A and synchronized Databricks. As a result, Table A and Column A have been added to Domain A. Then, you manually moved Table A and Column A to Domain B. When you synchronize Databricks again, Table A and Column A will move back to Domain A.
The Databricks Unity Catalog integration uses different naming conventions compared to the Edge JDBC naming conventions. The applied naming conventions are:
| Asset type | Naming convention | Example |
|---|---|---|
| Database | domainName>catalogName | ay-tech-domain-4>oleg-test |
| Schema | databaseFullName>schemaName | ay-tech-domain-4>oleg-test>demo |
| Table | schemaFullName>tableName | ay-tech-domain-4>oleg-test>demo>dinner |
| Column | tableFullName>columnName(column) | ay-tech-domain-4>oleg-test>demo>dinner>recipe(column) |
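The naming conventions above amount to joining each level with `>` and appending `(column)` to column names. The following is a minimal sketch of that pattern, not the integration's actual implementation; the function names are hypothetical, and the sample values are taken from the Example column.

```python
SEPARATOR = ">"  # separator that appears in the full names listed above


def database_full_name(domain_name: str, catalog_name: str) -> str:
    # Database: domainName>catalogName
    return f"{domain_name}{SEPARATOR}{catalog_name}"


def schema_full_name(database_name: str, schema_name: str) -> str:
    # Schema: databaseFullName>schemaName
    return f"{database_name}{SEPARATOR}{schema_name}"


def table_full_name(schema_name: str, table_name: str) -> str:
    # Table: schemaFullName>tableName
    return f"{schema_name}{SEPARATOR}{table_name}"


def column_full_name(table_name: str, column_name: str) -> str:
    # Column: tableFullName>columnName(column)
    return f"{table_name}{SEPARATOR}{column_name}(column)"


# Rebuild the example names from the documentation, level by level.
database = database_full_name("ay-tech-domain-4", "oleg-test")
schema = schema_full_name(database, "demo")
table = table_full_name(schema, "dinner")
column = column_full_name(table, "recipe")
print(column)
```

Because each full name embeds the full name of its parent, the name of a Column asset encodes the whole path from the domain down to the column.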
Tip: Databricks synchronization relies on UUIDs.
Note: If a temporary communication issue causes a partial synchronization, the status of the assets that cannot be synchronized is set to Missing from source. Their previous status is restored if they are found in the source system during the next fully successful synchronization.
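The status handling in the note can be modeled roughly as follows. This is a hypothetical sketch for illustration only: the `Asset` class, the `apply_sync` function, its parameters, and the sample status `Candidate` are assumptions. Only the Missing from source status and the restore on the next fully successful synchronization come from the text.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set

MISSING = "Missing from source"


@dataclass
class Asset:
    name: str
    status: str
    previous_status: Optional[str] = None  # remembered while the asset is missing


def apply_sync(assets: Iterable[Asset], found_in_source: Set[str],
               fully_successful: bool) -> None:
    """Update asset statuses after one synchronization run (illustrative model)."""
    for asset in assets:
        if asset.name in found_in_source:
            # Restore only on a fully successful run, as the note describes.
            if (fully_successful and asset.status == MISSING
                    and asset.previous_status is not None):
                asset.status = asset.previous_status
                asset.previous_status = None
        elif asset.status != MISSING:
            # Assumption: any asset the run could not reach is flagged the same way.
            asset.previous_status = asset.status
            asset.status = MISSING


table = Asset(name="dinner", status="Candidate")

# Partial synchronization: the table could not be read from the source.
apply_sync([table], found_in_source=set(), fully_successful=False)
status_after_partial = table.status

# Next fully successful synchronization: the table is found again.
apply_sync([table], found_in_source={"dinner"}, fully_successful=True)
status_after_full = table.status
```

In this model, an asset that is found during another partial run keeps the Missing from source status until a fully successful run confirms it.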
By default, the assets are shown in a plain list, but you can enable a multi-path hierarchy to show them in a tree structure. For the best result, use the following relations in the multi-path hierarchy:
- Technology Asset groups Technology Asset
- Database contains Table
- Technology Asset has Schema
- Schema contains Table
- Table contains Column
The following image shows the resulting hierarchical table.
Synchronized metadata per asset type
This table shows the metadata for each Databricks asset type.
| Asset type | Synchronized metadata | Resource ID |
|---|---|---|
| Database | Description from source system | 00000000-0000-0000-0001-000500000074 |
| | Owner in source | 00000000-0000-0000-0000-200000000001 |
| | Source Tags, if the HTTP path has been defined in the capability. The tag naming convention is Note: We fetch source tags from the Databricks Unity Catalog information schema using SQL; everything else is fetched by REST API. Tip: To have this field available on the asset page, an admin can navigate to the Database asset template and add Source Tags as a field. | 00000000-0000-0000-0011-000500000019 |
| | Any extensible properties defined via the capability. | |
| | Technology Asset groups / is grouped by Technology Asset | 00000000-0000-0000-0000-000000007054 |
| Schema | Description from source system. Tip: To have this field available on the asset page, an admin can navigate to the Schema asset template and add Description from source system as a field. | 00000000-0000-0000-0001-000500000074 |
| | Owner in source | 00000000-0000-0000-0000-200000000001 |
| | Source Tags, if the HTTP path has been defined in the capability. The tag naming convention is Note: We fetch source tags from the Databricks Unity Catalog information schema using SQL; everything else is fetched by REST API. | 00000000-0000-0000-0011-000500000019 |
| | Any extensible properties defined via the capability. | |
| | Technology Asset has / belongs to Schema | 00000000-0000-0000-0000-000000007024 |
| Table | Description from source system | 00000000-0000-0000-0001-000500000074 |
| | Owner in source | 00000000-0000-0000-0000-200000000001 |
| | Source Tags, if the HTTP path has been defined in the capability. The tag naming convention is Note: We fetch source tags from the Databricks Unity Catalog information schema using SQL; everything else is fetched by REST API. | 00000000-0000-0000-0011-000500000019 |
| | Any extensible properties defined via the capability. | |
| | Schema contains / is part of Table | 00000000-0000-0000-0000-000000007043 |
| Column | Description from source system | 00000000-0000-0000-0001-000500000074 |
| | Source Tags, if the HTTP path has been defined in the capability. The tag naming convention is Note: We fetch source tags from the Databricks Unity Catalog information schema using SQL; everything else is fetched by REST API. | 00000000-0000-0000-0011-000500000019 |
| | Column Position | 00000000-0000-0000-0001-000500000020 |
| | Is Nullable | 00000000-0000-0000-0001-000500000011 |
| | Is Primary Key | 00000000-0000-0000-0001-000500000015 |
| | Primary Key Name (if the column is the primary key) | 00000000-0000-0000-0001-000500000016 |
| | Original Name | 00000000-0000-0000-0001-000500000032 |
| | Technical Data Type | 00000000-0000-0000-0000-000000000219 |
| | Column is part of / contains Table | 00000000-0000-0000-0000-000000007042 |
| | Foreign Key Mapping (if the column is part of a foreign key) | 00000000-0000-0000-0000-000000007504 |
| Foreign Key | Tip: The full name of the Foreign Key asset has the following pattern: table_full_name > foreign_key_name (foreign_key) | |
| | Foreign Key Mapping | 00000000-0000-0000-0000-000000007504 |
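If you refer to these resource IDs in scripts, it can help to keep them in a single lookup table. The sketch below only restates the IDs listed above and checks that each one is a well-formed UUID; the names `RESOURCE_IDS` and `resource_id` are hypothetical, and the code makes no calls to any Collibra API.

```python
import uuid

# Resource IDs copied from the table above, keyed by the "Synchronized metadata" label.
RESOURCE_IDS = {
    "Description from source system": "00000000-0000-0000-0001-000500000074",
    "Owner in source": "00000000-0000-0000-0000-200000000001",
    "Source Tags": "00000000-0000-0000-0011-000500000019",
    "Column Position": "00000000-0000-0000-0001-000500000020",
    "Is Nullable": "00000000-0000-0000-0001-000500000011",
    "Is Primary Key": "00000000-0000-0000-0001-000500000015",
    "Primary Key Name": "00000000-0000-0000-0001-000500000016",
    "Original Name": "00000000-0000-0000-0001-000500000032",
    "Technical Data Type": "00000000-0000-0000-0000-000000000219",
    "Technology Asset groups / is grouped by Technology Asset": "00000000-0000-0000-0000-000000007054",
    "Technology Asset has / belongs to Schema": "00000000-0000-0000-0000-000000007024",
    "Schema contains / is part of Table": "00000000-0000-0000-0000-000000007043",
    "Column is part of / contains Table": "00000000-0000-0000-0000-000000007042",
    "Foreign Key Mapping": "00000000-0000-0000-0000-000000007504",
}


def resource_id(label: str) -> str:
    """Return the resource ID for a label from the table; raises KeyError if unknown."""
    return RESOURCE_IDS[label]


# Sanity check: every entry parses as a UUID and round-trips to the same string.
for label, value in RESOURCE_IDS.items():
    assert str(uuid.UUID(value)) == value, label

print(resource_id("Is Nullable"))
```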