Registering and synchronizing a data source via Edge

Important In Collibra 2024.02, we launched a new user interface (UI) in beta for Collibra Data Intelligence Platform. You can learn more about this latest UI in the UI overview.


Registering a data source via Edge makes metadata from the data source available in Collibra Data Intelligence Platform. The data flow is as follows:

Image showing the data flow of the metadata from a data source to the DIC database

Before you begin

Steps

The following steps are required for registering a data source via Edge. Each step lists what you do, a description, and the results.

Step 1: Register a data source

Description: Creates the structure for the metadata in Collibra.

Results:

  • A Physical Data Dictionary domain containing a Database asset is created.
  • A list of available schemas is created on the Configuration tab page of the Database asset.

Step 2: Configure the synchronization of your data source

Description: Allows you to define synchronization rules to indicate what you want to register. You can:

  • Include and exclude tables of the schema.
  • Specify the target domain in which to create assets.
  • Exclude database views.
  • Include source tags.

Results: The information on the Configuration tab page of the Database asset is completed.

Step 3: Synchronize the data source

Description: Makes the metadata available in Collibra.

Results: Schema, Table, Column, and Foreign Key assets are created in the specified domain, and registration data becomes available.

Step 4: If needed, profile and classify the synchronized data

Description: Data profiling creates a summary of a data source that is registered with Data Catalog and determines the data type of columns in the data source. The summary mainly contains statistics and graphics to give the user an idea of what the registered data is about.

Classification analyzes and predicts the content of registered data sources based on a subset of the data itself, helping you to easily gain insight into what kinds of data you have and where they reside.

Results: The Table and Column assets contain profiling information, and the Columns are classified.
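
The synchronization rules configured in step 2 can be pictured as a filter over the objects of a schema. The following is a minimal sketch under stated assumptions, not the Collibra implementation or API: the SyncRule class, its field names, the select_objects function, and the use of wildcard patterns are all hypothetical and serve only to illustrate how include rules, exclude rules, and the option to exclude views could combine.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import List, Tuple


@dataclass
class SyncRule:
    """Hypothetical model of one synchronization rule (not a Collibra API)."""
    schema: str
    target_domain: str
    # Assumption: rules use glob-style patterns; "*" includes every table.
    include_tables: List[str] = field(default_factory=lambda: ["*"])
    exclude_tables: List[str] = field(default_factory=list)
    exclude_views: bool = False
    include_source_tags: bool = False


def select_objects(rule: SyncRule, objects: List[Tuple[str, str]]) -> List[str]:
    """Return the names of the objects that the rule would register.

    Each object is a (name, kind) pair, where kind is "table" or "view".
    """
    selected = []
    for name, kind in objects:
        # Views are skipped entirely when the rule excludes them.
        if kind == "view" and rule.exclude_views:
            continue
        included = any(fnmatchcase(name, p) for p in rule.include_tables)
        excluded = any(fnmatchcase(name, p) for p in rule.exclude_tables)
        # In this sketch, an exclude pattern wins over an include pattern.
        if included and not excluded:
            selected.append(name)
    return selected


rule = SyncRule(
    schema="SALES",
    target_domain="Sales Physical Data Dictionary",
    exclude_tables=["TMP_*"],
    exclude_views=True,
)
objects = [("ORDERS", "table"), ("TMP_LOAD", "table"), ("V_ORDERS", "view")]
print(select_objects(rule, objects))  # ['ORDERS']
```

In this sketch, exclusion takes precedence over inclusion. How the product resolves overlapping rules is not stated here, so treat that ordering as an assumption.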

After registering a data source via Edge

When the registration is complete:

  • A message at the top right tells you that the database registration is complete. A domain and a Database asset are immediately created.
  • A workflow to assign a technical steward to the new domain starts. This is a simple out-of-the-box workflow that you can edit to fit your organization's needs. Once assigned, the technical steward must set the security classification and indicate whether the data elements contain personally identifiable information (PII).
  • If you registered a database without schemas, a new Schema asset is automatically created with the same name as the database or with a name as defined in the Edge capability.
  • You can synchronize schemas in the database, including all tables, columns, views and foreign keys. Collibra creates assets in the selected target domains.
    • The synchronization jobs of all schemas run in parallel.
    • Collibra creates reports:
      • during the synchronization, to show the progress of the synchronization job.
      • after synchronizing, to show the synchronization logs for each synchronized schema.
    • The created assets receive a unique full name (fully qualified name) based on naming conventions.
      You can view the full name of an asset by editing the asset.

      Warning Do not edit the full name of assets needed to synchronize or refresh data sources. This may cause unexpected results and break the synchronization or refresh process.

    • For information on how changes in a data source are handled (soft delete), see About synchronizing schemas. If you rename a database in the data source, the Edge synchronization process treats it as a new database, because database renames are currently not detected.
    • If you included source tags, the tags defined on the assets in the data source are registered and appear in the Source Tags attribute of the Schema, Table, Database View, and Column assets.

      Note Currently, you can synchronize source tags only from Snowflake.
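
The warning about editing full names and the remark about renamed databases both make sense if you assume that synchronization matches assets by their full name. The source implies this but does not spell it out, so the following is only an illustrative sketch. The full_name and diff_by_full_name functions and the ">" separator are hypothetical and do not reproduce the actual Collibra naming convention.

```python
def full_name(database, schema=None, table=None, column=None):
    """Build a hypothetical full name by joining the path parts with '>'."""
    parts = [database, schema, table, column]
    return ">".join(p for p in parts if p is not None)


def diff_by_full_name(existing, incoming):
    """Compare two sets of full names the way a name-keyed sync might."""
    return {
        "new": sorted(incoming - existing),      # would be created
        "missing": sorted(existing - incoming),  # no longer found in the source
        "matched": sorted(existing & incoming),  # would be updated in place
    }


# Assets that were created by an earlier synchronization.
existing = {full_name("crm", "public", "customers")}

# The database was renamed from "crm" to "crm_prod" in the data source.
incoming = {full_name("crm_prod", "public", "customers")}

result = diff_by_full_name(existing, incoming)
print(result["new"])      # ['crm_prod>public>customers']
print(result["missing"])  # ['crm>public>customers']
print(result["matched"])  # []
```

Because every full name starts with the database name, a rename leaves nothing to match: all incoming objects look new and all existing assets look as if they disappeared from the source. Manually editing an asset's full name would have the same effect for that asset.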

Additional information

If you want to learn more and watch a recording, check out the Data source registration via Edge training on Collibra University.