Steps overview: Data source registration via Edge

You can use Edge to register a data source and make metadata from the data source available in Collibra Platform while keeping the data in your infrastructure. The data flow is as follows:

[Image: the flow of the metadata from a data source to Collibra]

Prerequisites

  • Verify the following settings:
    • The Database registration via Edge setting is enabled to allow registering a data source via Edge.
    • The Maximum number of concurrent Edge jobs setting meets your needs.
    • The Maximum number of ingestion rules setting meets your needs.
  • You have created and installed an Edge site.
  • You have created a JDBC connection for your data source and have added the Catalog JDBC ingestion capability.
    If no JDBC connections and capabilities are configured, the message "No data available" appears on the Data Source Registration page and you can't continue registering the data source.

Steps

The following overview shows the steps required to register a data source via Edge. Each step lists what you do, a description, and the results.

Step 1: Register a data source

  Description: Creates the structure for the metadata in Collibra.

  Results:
    • A Physical Data Dictionary domain containing a Database asset is created.
    • A list of available schemas is created on the Configuration tab page of the Database asset.

  For more information, go to the full overview.

Step 2: Configure the synchronization of your data source

  Description: Allows you to define synchronization rules that indicate what you want to register. You can:
    • Include and exclude tables of the schema.
    • Specify the target domain in which to create assets.
    • Exclude database views.
    • Include source tags.

  Results: The information on the Configuration tab page of the Database asset is completed.

Step 3: Synchronize the data source

  Description: Makes the metadata available in Collibra.

  Results: Schema, Table, Column, and Foreign Key assets are created in the specified domain, and registration data becomes available.

  For more information, go to the full overview.

Step 4 (optional): Profile the synchronized data

  Description: Data Profiling creates a summary of a data source in Data Catalog and determines the data type of columns in the data source. The summary mainly contains statistics and graphics that give the user an idea of what the data is about. Data Profiling is available for registered JDBC data sources and for Databricks Unity Catalog and Dataplex Catalog data sources integrated via Edge.

  Results: The Table and Column assets contain profiling information.
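
The synchronization rules in step 2 can be pictured as a filter over the tables and views of a schema. The following Python sketch is only an illustration of that idea, not Collibra's implementation or API: the names SyncRule, SourceObject, and select_objects are hypothetical, and the use of shell-style wildcard patterns for the include and exclude lists is an assumption made for the example.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class SyncRule:
    """Hypothetical model of one synchronization rule (not a Collibra API)."""
    schema: str
    target_domain: str  # domain in which the assets would be created
    include_tables: list[str] = field(default_factory=lambda: ["*"])
    exclude_tables: list[str] = field(default_factory=list)
    exclude_views: bool = False
    include_source_tags: bool = False


@dataclass
class SourceObject:
    """A table or view as it exists in the data source."""
    schema: str
    name: str
    is_view: bool = False


def select_objects(rule, objects):
    """Return the names of the objects the rule would register, in input order."""
    selected = []
    for obj in objects:
        if obj.schema != rule.schema:
            continue  # the rule only applies to its own schema
        if obj.is_view and rule.exclude_views:
            continue  # views are skipped when the rule excludes them
        # Assumption: patterns are shell-style wildcards, and exclusion wins.
        included = any(fnmatchcase(obj.name, p) for p in rule.include_tables)
        excluded = any(fnmatchcase(obj.name, p) for p in rule.exclude_tables)
        if included and not excluded:
            selected.append(obj.name)
    return selected


rule = SyncRule(
    schema="sales",
    target_domain="Sales Physical Data Dictionary",
    include_tables=["orders*", "customers"],
    exclude_tables=["*_tmp"],
    exclude_views=True,
)
objects = [
    SourceObject("sales", "orders"),
    SourceObject("sales", "orders_tmp"),
    SourceObject("sales", "customers"),
    SourceObject("sales", "customer_summary", is_view=True),
    SourceObject("hr", "employees"),
]
print(select_objects(rule, objects))  # ['orders', 'customers']
```

In this sketch an exclusion always overrides an inclusion, which is why orders_tmp is dropped even though it matches orders*. Whether the product resolves such conflicts the same way is not stated in this overview.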

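To give a feel for what the profiling summary in step 4 involves, the following Python sketch infers a data type for a column and computes a few basic statistics. It is a simplified illustration under stated assumptions, not the Data Profiling algorithm itself: the functions infer_type and profile_column, the three type labels, and the choice of statistics are all hypothetical, and values are assumed to arrive as strings with None or an empty string meaning "no value".

```python
from collections import Counter


def infer_type(values):
    """Guess a column's data type from its non-null string values."""
    def all_parse(cast):
        # True if every value can be converted with the given constructor.
        try:
            for v in values:
                cast(v)
            return True
        except ValueError:
            return False

    if not values:
        return "unknown"
    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "decimal"
    return "text"


def profile_column(raw_values):
    """Summarize one column: inferred type, counts, and most common value."""
    # Assumption: None and the empty string both count as missing values.
    non_null = [v for v in raw_values if v is not None and v != ""]
    counts = Counter(non_null)
    return {
        "data_type": infer_type(non_null),
        "row_count": len(raw_values),
        "null_count": len(raw_values) - len(non_null),
        "distinct_count": len(counts),
        "most_common": counts.most_common(1)[0][0] if counts else None,
    }


sample = {
    "order_id": ["1", "2", "3", "4"],
    "amount": ["19.99", "5", None, "5"],
    "status": ["open", "closed", "open", ""],
}
for column, values in sample.items():
    print(column, profile_column(values))
```

For the amount column, the sketch reports the type decimal, four rows, one missing value, two distinct values, and "5" as the most common value.
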
Helpful resources

If you want to learn more and watch a recording, check out the Data source registration via Edge training on Collibra University.