About synchronizing schemas

Synchronizing schemas is the process of updating the metadata of a registered data source in Collibra Data Intelligence Cloud.

You can synchronize a schema manually or automatically at fixed intervals.

Synchronization process

  • After you register a data source via Edge, Data Catalog connects to your Edge site to create a list of schemas from the registered database.
    You can see the schema list on the Configuration tab page of the Database asset page.
  • To refresh the schema list, click the Refresh List icon on the Configuration tab page.
  • You can synchronize all schemas that have a table rule.
  • During the synchronization process, the Edge site connects to your data source again and ingests all schemas, tables and columns according to the table rules. Collibra Data Intelligence Cloud also detects whether anything has changed since the last synchronization of a schema. Edge resolves possible conflicts in the following way:

    Change in data source: A table, column or foreign key has been added to the schema.
    Result in Collibra: Collibra creates the assets.
    Required action: No action is required.

    Change in data source: A table, column or foreign key has been removed from the schema.
    Result in Collibra: The existing asset receives the Missing from source status. If the removed object is a table, the related Column assets also receive the Missing from source status.
    Required action: If needed, you can manually delete the assets.

    Change in data source: A schema has been removed.
    Result in Collibra: The schema receives the Missing from source status. The related Table and Column assets also receive the Missing from source status.
    Required action: If needed, you can manually delete the Schema asset and all related assets.

    Change in data source: A column or foreign key has been renamed.
    Result in Collibra:
    • Collibra creates an asset with the new name.
    • The existing asset receives the Missing from source status.
    Required action: If needed, you can reapply any manual changes that you made to the original asset to the new asset, and then delete the assets that are no longer applicable.

    Change in data source: A table has been renamed.
    Result in Collibra:
    • Collibra creates a Table asset with the new name. Collibra also creates new Column assets for the new Table asset.
    • The existing Table and related Column assets receive the Missing from source status.
    Required action: If needed, you can reapply any manual changes that you made to the original assets to the new assets, and then delete the assets that are no longer applicable.

    Change in data source: A schema has been renamed.
    Result in Collibra:
    • Collibra creates a Schema asset with the new name. Collibra also creates new Table and Column assets for the new Schema asset.
    • The existing schema and related assets receive the Missing from source status.
    Required action: If needed, you can reapply any manual changes that you made to the original assets to the new assets, and then delete the assets that are no longer applicable.

    Schema, Table, Column or Foreign Key assets with the Missing from source status don't block the synchronization process.

    Note: In the asset diagram, assets with the Missing from source status are shown by default. If you don't want to see these assets, apply a filter to the diagram view to only display assets with valid statuses.

  • If a schema is synchronized, you can see a check symbol beside the schema name. If the synchronization of a schema failed or the schema is no longer available in the source, an exclamation mark is shown instead.
    You can also see the synchronization status in the Activities list.
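
The conflict rules above amount to comparing the object names seen at the last synchronization with the names currently in the source. The following Python sketch illustrates that comparison only. It is not Collibra's implementation, and the names SyncResult and diff_snapshots are hypothetical. It assumes objects are identified purely by name, which is why a rename shows up as one new asset plus one asset with the Missing from source status.

```python
from dataclasses import dataclass, field


@dataclass
class SyncResult:
    """Hypothetical container for the outcome of one comparison."""
    created: set[str] = field(default_factory=set)
    missing_from_source: set[str] = field(default_factory=set)
    unchanged: set[str] = field(default_factory=set)


def diff_snapshots(previous: set[str], current: set[str]) -> SyncResult:
    """Compare names from the last synchronization with the current source."""
    return SyncResult(
        # Present in the source but not seen before: an asset is created.
        created=current - previous,
        # Seen before but gone from the source: the asset is flagged, not deleted.
        missing_from_source=previous - current,
        # Present in both snapshots: nothing to resolve.
        unchanged=previous & current,
    )


# Assumed example data: the column CUST_ID was renamed to CUSTOMER_ID.
previous = {"ORDERS.ID", "ORDERS.CUST_ID", "ORDERS.TOTAL"}
current = {"ORDERS.ID", "ORDERS.CUSTOMER_ID", "ORDERS.TOTAL"}

result = diff_snapshots(previous, current)
print("created:", sorted(result.created))
print("missing from source:", sorted(result.missing_from_source))
```

Because the comparison is by name, manual changes made to the original asset are not carried over to the new one, which matches the required action listed for renames.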

Table rules

A table rule determines which tables of a schema are synchronized in Data Catalog. Only schemas that have a table rule can be synchronized. If a schema has a table rule, you can see a table icon beside the schema name.

The following table describes the fields of a table rule:

Table rule field Description
Include

A comma-separated list of the names of the tables you want to synchronize.

  • In the list, add a space after each comma. For example, CUSTOMERS, ORDER, SKU.
  • You can use * as a wildcard.
  • The default value is *, which means all tables are taken into account.
  • The Exclude field takes priority over the Include field.
Example 
  • Out of all tables in a schema, you only want to synchronize the table with name "CUSTOMERS" and the tables with a name that starts with "ORDER".
    To do this:
    In the Include field, enter: CUSTOMERS, ORDER*.
  • Out of all tables in a schema, you only want to synchronize the tables with a name that contains "SKU".
    To do this:
    In the Include field, enter: *SKU*.
Exclude

A comma-separated list of the names of the tables you do not want to synchronize.

  • In the list, add a space after each comma. For example, CUSTOMERS, ORDER, SKU.
  • You can use * as a wildcard.
  • The Exclude field takes priority over the Include field.

You can use the Exclude table rule to do the following:

  • Synchronize all tables in a schema except the ones defined in the Exclude field.
  • Synchronize only tables as defined in the Include field, with the exception of tables that are listed in the Exclude field.
Example 
  • Out of all tables in a schema, you do not want to synchronize a table with the name "ADDRESS" and tables with a name that ends with "PHONE".
    To do this:
    In the Include field, enter: * and in the Exclude field, enter: ADDRESS, *PHONE.
  • Out of all tables in a schema, you want to synchronize the tables with a name that starts with "SKU", but exclude the tables with a name that contains "bkp".
    To do this:
    In the Include field, enter: SKU* and in the Exclude field, enter: *bkp*.
    From the following list, only "SKU_1" and "SKU_2" are synchronized:
    SKU_1, SKU_2, SKU_bkp_1, SKU_bkp_2, New, bkp, bkp_SKU
Target domain

The Physical Data Dictionary domain in which the schema is synchronized.

The default value is Schema domain: the metadata is placed in a domain located in the same community as the domain of your Database asset. If that domain doesn't exist yet, Data Catalog creates it.

You can select any other Physical Data Dictionary domain for which you have a resource role with the Configure external system resource permission.

Options

Additional options to specify which types of tables you want to synchronize.

Skip database views

A checkbox to exclude database views from the synchronization process. If selected, no assets of the type Database view are created.

Tip: You can also use the table rules to include or exclude specific database views.
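
The way the Include and Exclude fields combine can be illustrated with a short Python sketch. It is an approximation based only on the rules and examples in the table above, not Collibra's actual matching code. The function names are hypothetical, and case-sensitive matching is an assumption. Only * is treated as a wildcard, and a table is synchronized when it matches an Include pattern and no Exclude pattern.

```python
import re


def parse_rule(value: str) -> list[str]:
    """Split a comma-separated rule field into trimmed, non-empty patterns."""
    return [part.strip() for part in value.split(",") if part.strip()]


def wildcard_to_regex(pattern: str) -> "re.Pattern[str]":
    """Escape the pattern, then let each * match any run of characters."""
    return re.compile(re.escape(pattern).replace(r"\*", ".*"), re.DOTALL)


def matches_any(name: str, patterns: list[str]) -> bool:
    """Return True if the whole table name matches at least one pattern."""
    return any(wildcard_to_regex(p).fullmatch(name) for p in patterns)


def tables_to_synchronize(tables, include="*", exclude=""):
    """Keep tables that match Include and do not match Exclude."""
    include_patterns = parse_rule(include)
    exclude_patterns = parse_rule(exclude)
    return [
        name
        for name in tables
        if matches_any(name, include_patterns)
        and not matches_any(name, exclude_patterns)
    ]


# The table list from the Exclude example above.
tables = ["SKU_1", "SKU_2", "SKU_bkp_1", "SKU_bkp_2", "New", "bkp", "bkp_SKU"]
print(tables_to_synchronize(tables, include="SKU*", exclude="*bkp*"))
# prints ['SKU_1', 'SKU_2']
```

With the default Include value of * and an Exclude value of ADDRESS, *PHONE, the same function keeps every table except "ADDRESS" and those whose name ends with "PHONE".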