Registering a Databricks file system via Databricks JDBC connector and Edge

June 1, 2026

You can register a Databricks data source using the Databricks JDBC connector to ingest its metadata into Collibra. This process creates assets that represent your Databricks schemas, tables, and columns. You can also configure the connection to retrieve sample data, profile your data, and set up data classification.

Prerequisites

You have created and installed an Edge site, or you have been granted a Collibra Cloud site.
Required settings for database registration via Edge are enabled. For more information, go to Database registration via Edge.
If needed, your environment is set up to allow profiling and classification via Edge. For more information, go to Database profiling via Edge and Set up Unified Data Classification.

Steps

	Step	What	Description	Results
Preparation	1	Add a Databricks JDBC connection to your Edge or Collibra Cloud site	Adds a Databricks JDBC connection to your Edge or Collibra Cloud site.
Preparation	2	Add the following capabilities: Catalog JDBC ingestion; JDBC Profiling; Set up Unified Data Classification. If you also want to collect sample data, add Catalog JDBC Sampling.	Adds the required capabilities to the Databricks connection.
Setup	3	Register the data source	Registering a data source creates the metadata structure in Collibra.	A Physical Data Dictionary domain containing a Database asset is created. A list of available schemas is created on the Configuration tab of the Database asset.
Setup	4	Configure the synchronization of your data source	Select the schemas and tables that you want to ingest.	The information on the Configuration tab of the Database asset is filled.
Registration	5	Synchronize one or more schemas manually, or add a synchronization schedule to synchronize automatically	Synchronizes the schemas of a registered data source to make the metadata available in Collibra.	Schema, Table, Column, and Foreign Key assets are created in the specified domain, and registration data becomes available.
	6	Optionally, profile the synchronized data.	Data Profiling creates a summary of a data source in Data Catalog and determines the data type of columns in the data source. The summary mainly contains statistics and graphics that give the user an idea of what the data is about. Data Profiling is available for registered JDBC data sources and for Databricks Unity Catalog and Dataplex Catalog data sources integrated via Edge.	The Table and Column assets contain profiling information.
	7	Optionally, classify the synchronized data.	Creates data classification suggestions for the Column assets.	The Column assets are classified.
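
Step 1 in the table above relies on a JDBC connection string for the Databricks connection. As a minimal sketch, assuming the Databricks JDBC driver's `jdbc:databricks://` URL scheme with personal access token authentication (`AuthMech=3`, `UID=token`), such a string could be assembled as shown below. The helper name, hostname, and HTTP path are hypothetical placeholders, and the properties your connection actually needs may differ, so check them against your driver version and the connection form.

```python
def build_databricks_jdbc_url(server_hostname: str, http_path: str, port: int = 443) -> str:
    """Assemble a Databricks JDBC URL (hypothetical helper, not a Collibra API).

    Assumes personal access token authentication, for which the Databricks
    JDBC driver expects AuthMech=3 and the literal user name "token".
    The token itself is intentionally not embedded in the URL.
    """
    if not server_hostname or not http_path:
        raise ValueError("server_hostname and http_path are required")

    # Driver properties are appended to the URL as semicolon-separated key=value pairs.
    properties = {
        "httpPath": http_path,
        "AuthMech": "3",
        "UID": "token",
    }
    suffix = ";".join(f"{key}={value}" for key, value in properties.items())
    return f"jdbc:databricks://{server_hostname}:{port};{suffix}"


# Placeholder values for illustration only.
url = build_databricks_jdbc_url(
    "example.cloud.databricks.com",
    "/sql/1.0/warehouses/abc123",
)
print(url)
```

The access token is left out of the string on purpose. This sketch assumes it is supplied separately, for example in a password field of the connection, rather than stored in the URL.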

Helpful resources

For general information on working with Databricks in Collibra, go to Ways to work with Databricks.

For more information about Databricks, go to Databricks documentation.