Add the Catalog JDBC ingestion capability
Before you can start registering a data source via Edge, you need to add the Catalog JDBC ingestion capability to the JDBC connection for the data source.
Before you begin
- You have created and installed an Edge site.
- You have created a JDBC connection.
- Ensure the max cardinality of the asset attributes is at least 1.
Required permissions
You have a global role that has the Manage connections and capabilities global permission, for example, Edge integration engineer.
Steps
- Open an Edge site.
  - On the main toolbar, click → Settings.
    The Collibra settings page opens.
  - In the tab pane, click Edge.
    The Sites tab opens and shows a table with an overview of the Edge sites.
  - In the table, click the name of the Edge site whose status is Healthy.
    The Edge site page opens.
- In the Capabilities section, click Add capability.
  The Add capability page is shown.
- Select the Catalog JDBC ingestion capability template.
- Enter the required information.
Capability
This section contains general information about the capability.

| Field | Description | Required |
| --- | --- | --- |
| Name | The name of the Edge capability. | Yes |
| Description | The description of the Edge capability. | No |
| Capability template | The capability template. The value that you select in this field determines which sections appear on the page. Select the following Edge capability: Catalog JDBC ingestion. | Yes |

Connection
This section contains information to connect to the data source.

| Field | Description | Required |
| --- | --- | --- |
| JDBC connection | | Yes |
| JDBC data source type (Deprecated) | Deprecated field. The field was used to indicate the type of the data source. You no longer need to change this field. The required value is automatically identified. Note: The automatically identified value is not shown on this page. | Yes |
| Supports schemas | A text field where you have to enter True to enable database registration of data sources that have no schema. If the data source has schemas, you can ignore this field. Tip: If the data source does not have a schema, Data Catalog creates a Schema asset with the same name as the full name of the database. | No |
Other Settings / Others
This section can contain additional capability properties. Click Add property or Add Other Settings to add a property.

| Name | Description | Type | Encryption | Default value |
| --- | --- | --- | --- | --- |
| tags-strategy | To increase the performance of the Snowflake metadata synchronization if you register source tags, we can read the Snowflake source tags from the SNOWFLAKE.ACCOUNT_USAGE schema. To do so, change the value of this property to SINGLE_CALL. Note: This method requires you to have the SELECT permission on the SNOWFLAKE.ACCOUNT_USAGE.TAG_REFERENCES table. By default, we read the Snowflake source tags from <database_name>.INFORMATION_SCHEMA.tag_references. This method works with the minimum required permissions to perform the metadata scan. | Text | Not encrypted (plain text) | CALL_PER_TABLE |

You can also define properties for the following jobs, which can run as part of a database synchronization:
- database-list-with-metadata: This job runs when you register a data source and the Database field needs to be filled with available databases.
- schema-list: This job runs in the Configuration tab of a Database asset to show the schemas in the database.
- ingest-schema: This job runs when you click the Synchronize button.
Warning: The following properties can have a significant impact on your Edge site. Only add or update them together with Collibra Support.

| Name | Description | Type | Encryption | Default value |
| --- | --- | --- | --- | --- |
| database-list-with-metadata-garbage-collector | The garbage collector that is used by this job in the capability. For information about other possibilities, see AZUL documentation. | Text | Not encrypted (plain text) | -XX:+UseParallelGC |
| database-list-with-metadata-requests-cpu | The minimum amount of CPU computing power requested by this job in the capability. The amount is expressed in milliCPU. | Text | Not encrypted (plain text) | 100 |
| database-list-with-metadata-limits-cpu | The maximum amount of CPU computing power requested by this job in the capability. The amount is expressed in milliCPU. | Text | Not encrypted (plain text) | 950 |
| database-list-with-metadata-requests-memory | The minimum amount of memory requested by this job in the capability. The amount is expressed in mebibytes (Mi). Important: If you add this property, you also need to add the properties database-list-with-metadata-limits-memory and database-list-with-metadata-jvm-max-memory. | Text | Not encrypted (plain text) | 128 |
| database-list-with-metadata-limits-memory | The maximum amount of memory requested by this job in the capability. The amount is expressed in mebibytes (Mi). Important: If you add this property, you also need to add the properties database-list-with-metadata-requests-memory and database-list-with-metadata-jvm-max-memory. | Text | Not encrypted (plain text) | 256 |
| database-list-with-metadata-jvm-max-memory | The maximum amount of memory that can be used by the Java virtual machine (JVM) for this job. The amount is expressed in mebibytes (Mi). Warning: Make sure this amount is lower than the database-list-with-metadata-limits-memory amount. Important: If you add this property, you also need to add the properties database-list-with-metadata-requests-memory and database-list-with-metadata-limits-memory. | Text | Not encrypted (plain text) | 256 |
| schema-list-garbage-collector | The garbage collector that is used by this job in the capability. For information about other possibilities, see AZUL documentation. | Text | Not encrypted (plain text) | -XX:+UseParallelGC |
| schema-list-requests-cpu | The minimum amount of CPU computing power requested by this job in the capability. The amount is expressed in milliCPU. | Text | Not encrypted (plain text) | 100 |
| schema-list-limits-cpu | The maximum amount of CPU computing power requested by the capability. The amount is expressed in milliCPU. | Text | Not encrypted (plain text) | 950 |
| schema-list-requests-memory | The minimum amount of memory requested by this job in the capability. The amount is expressed in mebibytes (Mi). Important: If you add this property, you also need to add the properties schema-list-limits-memory and schema-list-jvm-max-memory. | Text | Not encrypted (plain text) | 128 |
| schema-list-limits-memory | The maximum amount of memory requested by this job in the capability. The amount is expressed in mebibytes (Mi). Important: If you add this property, you also need to add the properties schema-list-requests-memory and schema-list-jvm-max-memory. | Text | Not encrypted (plain text) | 256 |
| schema-list-jvm-max-memory | The maximum amount of memory that can be used by the Java virtual machine (JVM) for this job. The amount is expressed in mebibytes (Mi). Warning: Make sure this amount is lower than the schema-list-limits-memory amount. Important: If you add this property, you also need to add the properties schema-list-requests-memory and schema-list-limits-memory. | Text | Not encrypted (plain text) | 256 |
| ingest-schema-garbage-collector | The garbage collector that is used by this job in the capability. For information about other possibilities, see AZUL documentation. | Text | Not encrypted (plain text) | -XX:+UseParallelGC |
| ingest-schema-requests-cpu | The minimum amount of CPU computing power requested by this job in the capability. The amount is expressed in milliCPU. | Text | Not encrypted (plain text) | 100 |
| ingest-schema-limits-cpu | The maximum amount of CPU computing power requested by this job in the capability. The amount is expressed in milliCPU. | Text | Not encrypted (plain text) | 1500 |
| ingest-schema-requests-memory | The minimum amount of memory requested by this job in the capability. The amount is expressed in mebibytes (Mi). Important: If you add this property, you also need to add the properties ingest-schema-limits-memory and ingest-schema-jvm-max-memory. | Text | Not encrypted (plain text) | 128 |
| ingest-schema-limits-memory | The maximum amount of memory requested by the capability. The amount is expressed in mebibytes (Mi). Important: If you add this property, you also need to add the properties ingest-schema-requests-memory and ingest-schema-jvm-max-memory. | Text | Not encrypted (plain text) | 2048 |
| ingest-schema-jvm-max-memory | The maximum amount of memory that can be used by the Java virtual machine (JVM) for this job. The amount is expressed in mebibytes (Mi). Warning: Make sure this amount is lower than the ingest-schema-limits-memory amount. Important: If you add this property, you also need to add the properties ingest-schema-requests-memory and ingest-schema-limits-memory. | Text | Not encrypted (plain text) | 2048 |
| http-connect-timeout-seconds | The maximum amount of time allowed to create a connection with Collibra. The value must be set to 30 or higher. | Text | Not encrypted (plain text) | 30 |
| http-read-timeout-seconds | The maximum amount of time allowed to wait for a response before closing the connection. The value must be set to 300 or higher. | Text | Not encrypted (plain text) | 300 |

Note: No validation is performed on the values you add.
Adding properties in this section is not required.
General
This section contains general information about logging.

| Field | Description | Required |
| --- | --- | --- |
| Debug | An option to automatically send Edge infrastructure log files to Collibra Platform. By default, this option is set to false. Note: We highly recommend that you only send Edge infrastructure log files to Collibra Platform when you have issues with Edge. If you set it to true, it automatically reverts to false after 24 hours. For more information, go to logging. | No |
| Log level | An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging. | No |
- Click Create.
The capability is added to the Edge site.
The fields become read-only.
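Because no validation is performed on the property values you add, it can help to check them yourself before you enter them. The following Python sketch is not part of Collibra or Edge; it is a hypothetical local helper (the function name check_properties and the dictionary input are assumptions) that applies only the rules stated in the property descriptions: the three memory properties of a job are added together, the JVM max memory is lower than the memory limit, and the timeouts respect their minimum values. It assumes every value is a whole number entered as text.

```python
# Hypothetical pre-check for capability properties; not a Collibra API.
JOBS = ("database-list-with-metadata", "schema-list", "ingest-schema")
MEMORY_SUFFIXES = ("requests-memory", "limits-memory", "jvm-max-memory")
MIN_TIMEOUTS = {
    "http-connect-timeout-seconds": 30,
    "http-read-timeout-seconds": 300,
}


def check_properties(props):
    """Return a list of problems found in props (property name -> text value)."""
    problems = []
    for job in JOBS:
        names = [f"{job}-{suffix}" for suffix in MEMORY_SUFFIXES]
        missing = [name for name in names if name not in props]
        if len(missing) == len(names):
            continue  # No memory property added for this job: defaults apply.
        # Rule 1: if one memory property is added, the other two are needed too.
        if missing:
            problems.append(f"{job}: missing {', '.join(missing)}")
            continue
        # Rule 2: the JVM max memory must be lower than the memory limit (Mi).
        limit = int(props[f"{job}-limits-memory"])
        jvm_max = int(props[f"{job}-jvm-max-memory"])
        if jvm_max >= limit:
            problems.append(
                f"{job}: jvm-max-memory ({jvm_max}) must be lower than "
                f"limits-memory ({limit})"
            )
    # Rule 3: timeouts have a documented minimum value (seconds).
    for name, minimum in MIN_TIMEOUTS.items():
        if name in props and int(props[name]) < minimum:
            problems.append(f"{name}: must be {minimum} or higher")
    return problems


if __name__ == "__main__":
    candidate = {
        "ingest-schema-requests-memory": "128",
        "ingest-schema-limits-memory": "2048",
        "ingest-schema-jvm-max-memory": "1536",
        "http-read-timeout-seconds": "120",
    }
    for problem in check_properties(candidate):
        print(problem)
```

The sketch only reports problems and does not change any values. Treat its output as a reminder of the documented rules, and still add or update these properties together with Collibra Support.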
What's next?
If needed, also add the JDBC Profiling capability to the connection.
You can then register a data source via Edge.