Integrated Databricks Unity Catalog data

Databricks Unity Catalog objects are mapped to Collibra assets during synchronization. Once synchronization is completed, you can explore asset attributes or metadata synchronized per asset type.

After the synchronization, Database, Schema, Table, Database View, and Column assets become available in Collibra.

Tip  You can also integrate AI models using the Databricks Unity Catalog for AI integration. If you want to integrate Databricks AI models, ensure that AI Governance is enabled. Without AI Governance, the AI integration functionality is limited. For more information, go to the Databricks Unity Catalog for AI documentation.

Asset location

Asset status

The status of the assets depends on the selected value in the Default Asset Status field during synchronization.

When you integrate a data source without applying Include or Exclude Mappings rules, and then later exclude an integrated asset using an Include or Exclude Mapping during resynchronization, the related assets receive the Missing from source status.

Note If a temporary communication issue results in a partial synchronization, the status of the assets that were not synchronized becomes Missing from source. If the assets are identified in the source system during the next fully successful synchronization, the previous statuses are restored.
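The status behavior described above can be sketched as a small model. This is only an illustration, not Collibra's implementation: the Asset class, the apply_sync function, and the "Candidate" and "Accepted" status values are hypothetical names chosen for the example, and found_in_source stands for the set of assets that a synchronization run actually returned after any Include or Exclude Mappings were applied.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

MISSING = "Missing from source"


@dataclass
class Asset:
    name: str
    status: str
    # Remembered only while the asset is missing (assumption of this sketch).
    previous_status: Optional[str] = None


def apply_sync(assets: Dict[str, Asset], found_in_source: Set[str],
               default_status: str) -> None:
    """Update statuses after one synchronization run (hypothetical model)."""
    for name in found_in_source:
        asset = assets.get(name)
        if asset is None:
            # New asset: receives the configured Default Asset Status.
            assets[name] = Asset(name, default_status)
        elif asset.status == MISSING:
            # Asset is found again: restore the status it had before.
            asset.status = asset.previous_status or default_status
            asset.previous_status = None

    for name, asset in assets.items():
        if name not in found_in_source and asset.status != MISSING:
            # Asset was excluded or not reached: mark it as missing.
            asset.previous_status = asset.status
            asset.status = MISSING


# Usage: a full run, a partial run, then a fully successful run.
assets: Dict[str, Asset] = {}
apply_sync(assets, {"orders", "customers"}, "Candidate")
apply_sync(assets, {"orders"}, "Candidate")               # customers not reached
apply_sync(assets, {"orders", "customers"}, "Candidate")  # status restored
```

In this sketch, an asset that goes missing keeps a copy of its earlier status, which is what allows the status to be restored once the asset is identified in the source again.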

Integrated assets and their metadata

Database, Schema, Table, Database View, and Column assets are added.

You can enable a multi-path hierarchy to show the assets in a tree structure. For the best results, use the following relations in the multi-path hierarchy:

  1. Technology Asset groups Technology Asset
  2. Database contains Table
  3. Technology Asset has Schema
  4. Schema contains Table
  5. Table contains Column

The following image shows the resulting hierarchical table.
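To illustrate how the relations listed above combine into a tree, here is a minimal sketch that turns a list of parent-child relations into an indented hierarchy. The asset names (Databricks, main, sales, orders, order_id, amount) and the helper functions build_children and render are hypothetical and exist only for this example; the sketch does not call any Collibra API.

```python
from collections import defaultdict

# Hypothetical example assets; each tuple is (parent, relation, child).
relations = [
    ("Databricks", "Technology Asset groups Technology Asset", "main"),
    ("main", "Technology Asset has Schema", "sales"),
    ("sales", "Schema contains Table", "orders"),
    ("orders", "Table contains Column", "order_id"),
    ("orders", "Table contains Column", "amount"),
]


def build_children(rels):
    """Map each parent asset to the list of its child assets."""
    children = defaultdict(list)
    for parent, _relation, child in rels:
        children[parent].append(child)
    return children


def render(root, children, depth=0):
    """Return the hierarchy below root as indented lines, depth first."""
    lines = ["  " * depth + root]
    for child in children.get(root, []):
        lines.extend(render(child, children, depth + 1))
    return lines


print("\n".join(render("Databricks", build_children(relations))))
```

Each relation type contributes one level of the tree, which is why all of the listed relations are needed to walk from the top-level asset down to the columns.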

Synchronized metadata per asset type

This table shows the synchronized metadata for each Databricks asset type, together with the public ID of each item.

Database

  Description from source system (Public ID: DescriptionFromSourceSystem)
  Owner in source (Public ID: OwnerInSource)
  Source Tags (Public ID: SourceTags), if the HTTP path has been defined in the capability and the Databricks access token or OAuth Client has the required permissions.
    The tag naming convention is <source_tag_name>:<source_tag_value>. If the value is empty, we show <source_tag_name>:.
    We fetch source tags from the Databricks Unity Catalog information schema using SQL; everything else is fetched by REST API.
  Data Source Type (Public ID: DataSourceType)
    The value is automatically set to Databricks Unity Catalog.
  Any extensible properties defined in the configuration.
  Technology Asset groups / is grouped by Technology Asset (Public ID: TechnologyAssetHasSchema)

Schema

  Description from source system (Public ID: DescriptionFromSourceSystem)
  Owner in source (Public ID: OwnerInSource)
  Source Tags (Public ID: SourceTags), if the HTTP path has been defined in the capability and the Databricks access token or OAuth Client has the required permissions.
    The tag naming convention is <source_tag_name>:<source_tag_value>. If the value is empty, we show <source_tag_name>:.
    We fetch source tags from the Databricks Unity Catalog information schema using SQL; everything else is fetched by REST API.
  Data Source Type (Public ID: DataSourceType)
    The value is automatically set to Databricks Unity Catalog.
  Any extensible properties defined in the configuration.
  Technology Asset has / belongs to Schema (Public ID: TechnologyAssetHasSchema)

Table

  Description from source system (Public ID: DescriptionFromSourceSystem)
  Owner in source (Public ID: OwnerInSource)
  Source Tags (Public ID: SourceTags), if the HTTP path has been defined in the capability and the Databricks access token or OAuth Client has the required permissions.
    The tag naming convention is <source_tag_name>:<source_tag_value>. If the value is empty, we show <source_tag_name>:.
    We fetch source tags from the Databricks Unity Catalog information schema using SQL; everything else is fetched by REST API.
  Any extensible properties defined in the configuration.
  Schema contains / is part of Table (Public ID: SchemaContainsTable)

Database View (includes Databricks metric views)

  Description from source system (Public ID: DescriptionFromSourceSystem)
  Owner in source (Public ID: OwnerInSource)
  Source Tags (Public ID: SourceTags), if the HTTP path has been defined in the capability and the Databricks access token or OAuth Client has the required permissions.
    The tag naming convention is <source_tag_name>:<source_tag_value>. If the value is empty, we show <source_tag_name>:.
    We fetch source tags from the Databricks Unity Catalog information schema using SQL; everything else is fetched by REST API.
  Schema contains / is part of Table (Public ID: SchemaContainsTable)

Column

  Description from source system (Public ID: DescriptionFromSourceSystem)
  Source Tags (Public ID: SourceTags), if the HTTP path has been defined in the capability and the Databricks access token or OAuth Client has the required permissions.
    The tag naming convention is <source_tag_name>:<source_tag_value>. If the value is empty, we show <source_tag_name>:.
    We fetch source tags from the Databricks Unity Catalog information schema using SQL; everything else is fetched by REST API.
  Column Position (Public ID: ColumnPosition)
  Is Nullable (Public ID: IsNullable)
  Is Primary Key (Public ID: IsPrimaryKey)
  Primary Key Name, if the column is the primary key (Public ID: PrimaryKeyName)
  Original Name (Public ID: OriginalName)
  Technical Data Type (Public ID: TechnicalDataType)
    Tip  You see the technical data type in the Technical Data Type field in the At a glance sidebar of the Column asset. If the At a glance sidebar is hidden, click the Info icon. For columns that have a structured technical data type, Array or Struct, click the hyperlink to see the structure of the data in a dialog box. In other locations, for example, in Table assets, click the View Array or View Struct button to open the dialog box.
  Column is part of / contains Table (Public ID: ColumnIsPartOfTable)
  Foreign Key Mapping, if the column is part of a foreign key (Public ID: ForeignKeyMapping)

Foreign Key

  The full name of the Foreign Key asset has the following pattern: table_full_name > foreign_key_name (foreign_key)
  Foreign Key Mapping (Public ID: ForeignKeyMapping)
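
The two naming conventions mentioned in the table, the source tag format and the Foreign Key full name pattern, can be sketched as follows. The function names and the sample values are hypothetical, and the sketch assumes that the trailing (foreign_key) in the pattern is a literal suffix; it only illustrates the resulting strings and is not how the integration itself builds them.

```python
from typing import Optional


def format_source_tag(name: str, value: Optional[str] = "") -> str:
    """Build <source_tag_name>:<source_tag_value>; an empty value gives <source_tag_name>:."""
    return f"{name}:{value or ''}"


def foreign_key_full_name(table_full_name: str, foreign_key_name: str) -> str:
    """Build table_full_name > foreign_key_name (foreign_key).

    Assumption of this sketch: "(foreign_key)" is a fixed, literal suffix.
    """
    return f"{table_full_name} > {foreign_key_name} (foreign_key)"


# Hypothetical sample values, for illustration only.
print(format_source_tag("sensitivity", "high"))
print(format_source_tag("reviewed"))
print(foreign_key_full_name("orders_table_full_name", "fk_customer"))
```

The second call shows the empty-value case, where only the tag name and the colon remain.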

What's next

Depending on your needs and setup, you can profile and classify the data for the integrated assets. For more information, go to Steps: Integrate Databricks Unity Catalog via Edge.