Registering a data source via Edge

Registering a data source via Edge makes metadata from the data source available in Collibra Data Intelligence Cloud.

Tip: You can also register a data source via Jobserver.

Steps

The following table shows the steps required for registering a data source via Edge.

Step: 0
What?: Prerequisites
Description: When you register a data source, the Register content page shows a list of available JDBC connections that you can use to register your database.

Step: 1
What?: Register a data source
Description: Registering a data source creates the structure for the metadata in Collibra.
Results:

  • A Physical Data Dictionary domain containing a Database asset is created.
  • A list of available schemas is created on the Configuration tab page of the Database asset.

Step: 2
What?: Configure the synchronization of your data source
Description: Select the schemas and tables that you want to ingest. When you select a schema to ingest, you can set the table rules to:

  • Include and exclude tables of the schema.
  • Specify the target domain in which to create assets.
  • Exclude database views.

Results: The information on the Configuration tab page of the Database asset is filled in.

Step: 3
Description: Synchronize the schema of a registered data source to make the metadata available in Collibra.
Results: Schema, Table, Column and Foreign Key assets are created in the specified domain, and registration data becomes available.

Step: 4
What?: If needed, profile and classify the synchronized data.
Description: Data profiling creates a summary of a data source that is registered with Data Catalog and determines the data type of the columns in the data source. The summary mainly contains statistics and graphics that give the user an idea of what the registered data is about.

Classification analyzes and predicts the content of registered data sources based on a subset of the data itself, helping you see what kinds of data you have and where they reside.

Results: The Table and Column assets contain profiling information and the Column assets are classified.
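
To make the idea of a profiling summary more concrete, the following is a minimal sketch in Python of the kind of per-column statistics such a summary might contain. It is an illustration only and not Collibra's profiling implementation: the names ColumnProfile, infer_type and profile_column are hypothetical, and the choice of statistics (row count, null count, distinct count, most common value type, minimum and maximum) is an assumption.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any, Optional, Sequence


@dataclass
class ColumnProfile:
    """Hypothetical summary of one column; not a Collibra data structure."""
    row_count: int
    null_count: int
    distinct_count: int
    inferred_type: str
    minimum: Optional[Any]
    maximum: Optional[Any]


def infer_type(values: Sequence[Any]) -> str:
    """Return the most common Python type name among the non-null values."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        return "unknown"
    counts = Counter(type(v).__name__ for v in non_null)
    return counts.most_common(1)[0][0]


def profile_column(values: Sequence[Any]) -> ColumnProfile:
    """Compute simple statistics for a sample of column values."""
    non_null = [v for v in values if v is not None]
    inferred = infer_type(values)
    # Minimum and maximum are only reported when every non-null value
    # has the inferred type, so that the values can be compared safely.
    comparable = bool(non_null) and all(
        type(v).__name__ == inferred for v in non_null
    )
    return ColumnProfile(
        row_count=len(values),
        null_count=len(values) - len(non_null),
        distinct_count=len(set(non_null)),
        inferred_type=inferred,
        minimum=min(non_null) if comparable else None,
        maximum=max(non_null) if comparable else None,
    )
```

For example, calling profile_column on a small sample of values from one column returns a single ColumnProfile record that could be displayed next to the corresponding Column asset.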

After registering a data source via Edge

When the registration is complete:

  • A message at the top right tells you that the database registration is complete. A domain and a Database asset are immediately created.
  • A workflow to assign a technical steward to the new domain is started. This is a simple out-of-the-box workflow that you can edit to fit your organization's needs. When you have assigned a technical steward, that technical steward has to set the security classification and indicate whether the data elements contain personally identifiable information (PII).
  • If you registered a database without schemas, a new Schema asset is automatically created with the same name as the database or with a name as defined in the Edge capability.
  • You can synchronize schemas in the database, including all tables, columns, views and foreign keys. Collibra creates assets in the selected target domains.
    • The synchronization jobs of all schemas run in parallel.
    • Collibra creates reports:
      • during the synchronization, to show the progress of the synchronization job.
      • after synchronizing, to show the synchronization logs for each synchronized schema.
    • The created assets receive a unique full name based on the following naming convention: [asset parent full name]>[asset name]
      Asset type: Database
      Naming convention: edgeConnectionName>jdbccatalog
      where jdbccatalog is the name retrieved from the JDBC "catalog" property.
      Example: Posgresql xs-gxsQ>posgresqlsmall

      Asset type: Schema
      Naming convention: edgeConnectionName>jdbccatalog>schemaName
      Example: Posgresql xs-gxsQ>posgresqlsmall>public

      Asset type: Table
      Naming convention: edgeConnectionName>jdbccatalog>schemaName>tableName
      Example: Posgresql xs-gxsQ>posgresqlsmall>public>Condition

      Asset type: Database view
      Naming convention: edgeConnectionName>jdbccatalog>schemaName>viewName
      Example: Posgresql xs-gxsQ>posgresqlsmall>public>PriorConditions

      Asset type: Column
      Naming convention:
        edgeConnectionName>jdbccatalog>schemaName>tableName>columnName(column)
        edgeConnectionName>jdbccatalog>schemaName>viewName>columnName(column)
      Example: Posgresql xs-gxsQ>posgresqlsmall>public>Condition>period.end(column)

      Asset type: Foreign key
      Naming convention: edgeConnectionName>jdbccatalog>schemaName>foreignKeyName(foreign key)
      Example: Posgresql xs-gxsQ>posgresqlsmall>public>con.id(foreign key)

      You can view the full name of an asset by editing the asset.

      Warning: Do not edit the full names of assets that are needed to synchronize or refresh data sources. Doing so may cause unexpected results and break the synchronization or refresh process.
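
The naming convention above can be sketched as a few small Python helpers, which may be useful when you need to predict or look up the full name of a synchronized asset. This is a minimal sketch based only on the convention and examples listed above. The function names are hypothetical and are not part of any Collibra API. Collibra generates the real full names during synchronization.

```python
SEPARATOR = ">"  # separator between the parent full name and the asset name


def database_full_name(edge_connection_name: str, jdbc_catalog: str) -> str:
    """edgeConnectionName>jdbccatalog"""
    return SEPARATOR.join([edge_connection_name, jdbc_catalog])


def schema_full_name(edge_connection_name: str, jdbc_catalog: str,
                     schema_name: str) -> str:
    """edgeConnectionName>jdbccatalog>schemaName"""
    parent = database_full_name(edge_connection_name, jdbc_catalog)
    return SEPARATOR.join([parent, schema_name])


def table_full_name(edge_connection_name: str, jdbc_catalog: str,
                    schema_name: str, table_or_view_name: str) -> str:
    """Tables and database views follow the same pattern."""
    parent = schema_full_name(edge_connection_name, jdbc_catalog, schema_name)
    return SEPARATOR.join([parent, table_or_view_name])


def column_full_name(edge_connection_name: str, jdbc_catalog: str,
                     schema_name: str, table_or_view_name: str,
                     column_name: str) -> str:
    """Column names carry the literal suffix "(column)"."""
    parent = table_full_name(edge_connection_name, jdbc_catalog,
                             schema_name, table_or_view_name)
    return f"{parent}{SEPARATOR}{column_name}(column)"


def foreign_key_full_name(edge_connection_name: str, jdbc_catalog: str,
                          schema_name: str, foreign_key_name: str) -> str:
    """Foreign keys hang off the schema and carry the suffix "(foreign key)"."""
    parent = schema_full_name(edge_connection_name, jdbc_catalog, schema_name)
    return f"{parent}{SEPARATOR}{foreign_key_name}(foreign key)"


if __name__ == "__main__":
    # Values taken from the examples in the naming convention above.
    print(column_full_name("Posgresql xs-gxsQ", "posgresqlsmall",
                           "public", "Condition", "period.end"))
```

Each helper builds on its parent, mirroring the [asset parent full name]>[asset name] rule, so a change to the connection name or catalog flows through to every derived name.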