Add the Catalog JDBC ingestion capability

Important

In Collibra 2024.05, we launched a new user interface (UI) for Collibra Platform! You can learn more about this latest UI in the UI overview.

Before you can start registering a data source via Edge, you need to add the Catalog JDBC ingestion capability to the JDBC connection for the data source.

Note If you're using a Collibra Cloud site, go to the Collibra Cloud site documentation to check if your data source is supported.

Before you begin

You either created and installed an Edge site or were granted a Collibra Cloud site.
You have created a JDBC connection.
Ensure the max cardinality of the asset attributes is at least 1.

Required permissions

You have a global role that has the Manage connections and capabilities global permission, for example, Edge integration engineer.

Steps

Open a site.
1. On the main toolbar, click → Settings.
  The Settings page opens.
2. In the tab pane, click Edge.
  The Sites tab opens and shows a table with an overview of your sites.
3. In the table, click the name of the site whose status is Healthy.
  The site page opens.
In the Capabilities section, click Add capability.
The Add capability page is shown.
Select the Catalog JDBC ingestion capability template.

Enter the required information.

Field	Description	Required
Capability	This section contains general information about the capability.	
Name	The name of the capability.	Yes
Description	The description of the capability.	
Capability template	The capability template. The value that you select in this field determines which sections appear on the page. Select the following capability: Catalog JDBC ingestion	Yes
Connection	This section contains information to connect to the data source.	
JDBC connection	The connection to the data source.	Yes
JDBC data source type (Deprecated)	Deprecated field. The field was used to indicate the type of the data source. You no longer need to change this field. The required value is automatically identified. Note The automatically identified value is not shown on this page.	Yes
Supports schemas	A text field where you have to enter True to enable database registration of data sources that have no schema. If the data source has schemas, you can ignore this field. Tip If the data source does not have a schema, Data Catalog creates a Schema asset with the same name as the full name of the database.	

Other Settings

Others

This section can contain additional capability properties.

Click Add property or Add Other Settings to add a property.

Name	Description	Type	Encryption	Default value
tags-strategy	To increase the performance of the Snowflake metadata synchronization when you register source tags, we can read the Snowflake source tags from the SNOWFLAKE.ACCOUNT_USAGE schema. To do so, set this property to SINGLE_CALL. Note This method requires you to have the SELECT permission on the SNOWFLAKE.ACCOUNT_USAGE.TAG_REFERENCES table.	Text	Not encrypted (plain text)	CALL_PER_TABLE

By default, we read the Snowflake source tags from <database_name>.INFORMATION_SCHEMA.tag_references. This method works with the minimum required permissions to perform the metadata scan.
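
As a rough illustration of the two tags-strategy values, the following Python sketch maps each value to the Snowflake location named above. The function name tag_source_for and its parameters are hypothetical and are not part of Collibra or Snowflake; the sketch only restates the behavior described in this section.

```python
# Hypothetical helper (not a Collibra or Snowflake API): restates where
# Snowflake source tags are read from for each tags-strategy value.
def tag_source_for(strategy: str, database_name: str) -> str:
    if strategy == "SINGLE_CALL":
        # Requires the SELECT permission on this table.
        return "SNOWFLAKE.ACCOUNT_USAGE.TAG_REFERENCES"
    if strategy == "CALL_PER_TABLE":
        # Default value: works with the minimum required permissions.
        return f"{database_name}.INFORMATION_SCHEMA.tag_references"
    raise ValueError(f"Unknown tags-strategy value: {strategy!r}")
```

Any value other than the two named in this section is rejected here. That is a choice made for the sketch, because the page does not say how other values are handled.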

You can also define properties for the following jobs, which can run in the context of a database synchronization.

database-list-with-metadata: This job runs when you register a data source and the Database field needs to be filled with available databases.
schema-list: This job runs in the Configuration tab of a Database asset to show the schemas in the database.
ingest-schema: This job runs when you click the Synchronize button.
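
The per-job properties follow a <job>-<setting> naming pattern. As a minimal sketch, assuming only the six settings documented on this page, the following Python snippet builds the property names for a job. The names JOBS, SETTINGS, and job_property_names are hypothetical and exist only for illustration.

```python
from typing import List

# Job names and settings as documented on this page.
JOBS = ("database-list-with-metadata", "schema-list", "ingest-schema")
SETTINGS = (
    "garbage-collector",
    "requests-cpu",
    "limits-cpu",
    "requests-memory",
    "limits-memory",
    "jvm-max-memory",
)

def job_property_names(job: str) -> List[str]:
    """Return the property names for one job, for example 'schema-list-limits-cpu'."""
    if job not in JOBS:
        raise ValueError(f"Unknown job: {job!r}")
    return [f"{job}-{setting}" for setting in SETTINGS]
```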

Warning The following properties can have a significant impact on your Edge site. Only add or update them together with Collibra Support.
Name	Description	Type	Encryption	Default value
database-list-with-metadata-garbage-collector	The garbage collector that is used by this job in the capability. For information about other possibilities, see AZUL documentation.	Text	Not encrypted (plain text)	-XX:+UseParallelGC
database-list-with-metadata-requests-cpu	The minimum amount of CPU computing power requested by this job in the capability. The amount is expressed in milliCPU.	Text	Not encrypted (plain text)	100
database-list-with-metadata-limits-cpu	The maximum amount of CPU computing power requested by this job in the capability. The amount is expressed in milliCPU.	Text	Not encrypted (plain text)	950
database-list-with-metadata-requests-memory	The minimum amount of memory requested by this job in the capability. The amount is expressed in mebibytes (Mi). Important If you add this property, you also need to add the properties: database-list-with-metadata-limits-memory and database-list-with-metadata-jvm-max-memory.	Text	Not encrypted (plain text)	128
database-list-with-metadata-limits-memory	The maximum amount of memory requested by this job in the capability. The amount is expressed in mebibytes (Mi). Important If you add this property, you also need to add the properties: database-list-with-metadata-requests-memory and database-list-with-metadata-jvm-max-memory.	Text	Not encrypted (plain text)	256
database-list-with-metadata-jvm-max-memory	The maximum amount of memory that can be used by the Java virtual machine (JVM) for this job. The amount is expressed in mebibytes (Mi). Warning Make sure this amount is lower than the database-list-with-metadata-limits-memory amount. Important If you add this property, you also need to add the properties: database-list-with-metadata-requests-memory and database-list-with-metadata-limits-memory.	Text	Not encrypted (plain text)	256

schema-list-garbage-collector	The garbage collector that is used by this job in the capability. For information about other possibilities, see AZUL documentation.	Text	Not encrypted (plain text)	-XX:+UseParallelGC
schema-list-requests-cpu	The minimum amount of CPU computing power requested by this job in the capability. The amount is expressed in milliCPU.	Text	Not encrypted (plain text)	100
schema-list-limits-cpu	The maximum amount of CPU computing power requested by this job in the capability. The amount is expressed in milliCPU.	Text	Not encrypted (plain text)	950
schema-list-requests-memory	The minimum amount of memory requested by this job in the capability. The amount is expressed in mebibytes (Mi). Important If you add this property, you also need to add the properties: schema-list-limits-memory and schema-list-jvm-max-memory.	Text	Not encrypted (plain text)	128
schema-list-limits-memory	The maximum amount of memory requested by this job in the capability. The amount is expressed in mebibytes (Mi). Important If you add this property, you also need to add the properties: schema-list-requests-memory and schema-list-jvm-max-memory.	Text	Not encrypted (plain text)	256
schema-list-jvm-max-memory	The maximum amount of memory that can be used by the Java virtual machine (JVM) for this job. The amount is expressed in mebibytes (Mi). Warning Make sure this amount is lower than the schema-list-limits-memory amount. Important If you add this property, you also need to add the properties: schema-list-requests-memory and schema-list-limits-memory.	Text	Not encrypted (plain text)	256

ingest-schema-garbage-collector	The garbage collector that is used by this job in the capability. For information about other possibilities, see AZUL documentation.	Text	Not encrypted (plain text)	-XX:+UseParallelGC
ingest-schema-requests-cpu	The minimum amount of CPU computing power requested by this job in the capability. The amount is expressed in milliCPU.	Text	Not encrypted (plain text)	100
ingest-schema-limits-cpu	The maximum amount of CPU computing power requested by this job in the capability. The amount is expressed in milliCPU.	Text	Not encrypted (plain text)	1500
ingest-schema-requests-memory	The minimum amount of memory requested by this job in the capability. The amount is expressed in mebibytes (Mi). Important If you add this property, you also need to add the properties: ingest-schema-limits-memory and ingest-schema-jvm-max-memory.	Text	Not encrypted (plain text)	128
ingest-schema-limits-memory	The maximum amount of memory requested by this job in the capability. The amount is expressed in mebibytes (Mi). Important If you add this property, you also need to add the properties: ingest-schema-requests-memory and ingest-schema-jvm-max-memory.	Text	Not encrypted (plain text)	2048
ingest-schema-jvm-max-memory	The maximum amount of memory that can be used by the Java virtual machine (JVM) for this job. The amount is expressed in mebibytes (Mi). Warning Make sure this amount is lower than the ingest-schema-limits-memory amount. Important If you add this property, you also need to add the properties: ingest-schema-requests-memory and ingest-schema-limits-memory.	Text	Not encrypted (plain text)	2048

http-connect-timeout-seconds	The maximum amount of time allowed to create a connection with Collibra. The value must be set to 30 or higher.	Text	Not encrypted (plain text)	30
http-read-timeout-seconds	The maximum amount of time allowed to wait for a response before closing the connection. The value must be set to 300 or higher.	Text	Not encrypted (plain text)	300

Note No validation is performed on the values you add.
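
Because the values are not validated, it can help to check them yourself before you enter them. The following Python sketch checks a set of property values against the rules stated in the table: the three memory properties of a job must be added together, the jvm-max-memory amount must be lower than the limits-memory amount, and the timeouts must be at least 30 and 300 seconds. The function check_properties and its input format (a dictionary of property names to text values that hold plain integers) are assumptions made for illustration. They are not part of Collibra, and the check does not replace working with Collibra Support.

```python
from typing import Dict, List

JOBS = ("database-list-with-metadata", "schema-list", "ingest-schema")
MEMORY_SETTINGS = ("requests-memory", "limits-memory", "jvm-max-memory")
TIMEOUT_MINIMUMS = {
    "http-connect-timeout-seconds": 30,
    "http-read-timeout-seconds": 300,
}

def check_properties(props: Dict[str, str]) -> List[str]:
    """Return a list of problems found; an empty list means no rule was broken.

    Assumption: every checked value is a plain integer stored as text.
    """
    problems = []
    for job in JOBS:
        names = [f"{job}-{setting}" for setting in MEMORY_SETTINGS]
        missing = [name for name in names if name not in props]
        if len(missing) == len(names):
            continue  # No memory property was added for this job.
        if missing:
            # The memory properties must be added together.
            problems.append(f"{job}: missing {', '.join(missing)}")
            continue
        jvm_max = int(props[f"{job}-jvm-max-memory"])
        limit = int(props[f"{job}-limits-memory"])
        if jvm_max >= limit:
            # Follows the warning text literally: strictly lower.
            problems.append(
                f"{job}: jvm-max-memory ({jvm_max}) must be lower than "
                f"limits-memory ({limit})"
            )
    for name, minimum in TIMEOUT_MINIMUMS.items():
        if name in props and int(props[name]) < minimum:
            problems.append(f"{name}: must be {minimum} or higher")
    return problems
```

For example, passing only schema-list-limits-memory reports the two missing companion properties, and a complete set of three memory properties in which jvm-max-memory is below limits-memory returns an empty list.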

Field	Description	Required
General	This section contains general information about logging.	
Debug	An option to automatically send Edge infrastructure log files to Collibra Platform. By default, this option is set to false. Note We highly recommend sending Edge infrastructure log files to Collibra Platform only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours. For more information, go to logging.	
Log level	An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.	

Click Create.
The capability is added to the Edge or Collibra Cloud site.
The fields become read-only.

What's next?

If needed, also add the JDBC Profiling capability to the connection.
You can then register a data source via Edge.