Warning: Jobserver and all related Jobserver integrations are end of life as of October 2024, with the exception of Public Sector customers using GovCloud or on-prem environments.
For information about using Catalog connectors on Edge, go to Overview of Catalog connectors.
You can register a database as a data source using one of the JDBC drivers provided by Collibra Marketplace.
- For an overview of the connection details of the various databases, see the JDBC connection details of Collibra-provided drivers.
- You can also do this with your own JDBC drivers.
- This operation should only be executed by your database administrator.
Available JDBC drivers
Collibra supports a wide range of data sources.
In Collibra 2024.05, we launched a new user interface (UI) for Collibra Data Intelligence Platform! You can learn more about this latest UI in the UI overview.
The steps below are documented for both the latest UI and the previous, classic UI.
Prerequisites
- You have a global role with the Catalog global permission, for example, Catalog Author.
- You have configured one or more Jobservers in Collibra Console. If there is no available Jobserver, the Register data source actions will be grayed out in the global create menu of Collibra Data Intelligence Platform.
- If you are using a Collibra Data Intelligence Platform environment with an on-premises Jobserver, both must have the same installer version. You can find the installer version of your Collibra Data Intelligence Platform environment at the bottom of the sign-in window of its Collibra Console, for example, 5.9.2-0.
- You have a resource role with the following resource permissions on the Schema community:
  - Asset > add
  - Attribute > add
  - Domain > add
  - Attachment > add
Steps
- On the main toolbar, go to Catalog.
  The Catalog Home opens.
- On the main toolbar, open the global create menu.
  The Create dialog box appears.
- In the Create dialog box, click Register a Data Source Using a Collibra Certified Driver.
- Do one of the following:
  - Click Select in the row of an existing driver to continue.
  - Add a new driver.
    - Click Add Driver.
    - Enter the required information.
      - JDBC Driver Version Name: The name of the JDBC driver.
        Tip: As a best practice, use a strict naming convention that includes the data source and a version number. For example: Google BigQuery 1.5 or MySQL 5.9.
      - Driver files: This table contains a list of uploaded files. You can remove a driver file from this table.
      - Connection string: The JDBC connection string. For the value to use for your data source, see the JDBC connection details of Collibra-provided drivers.
        Warning: Some connection properties can be added to the URL as name-value pairs separated by semicolons. We recommend that you do not use this mechanism unless our documentation specifies otherwise. Keep in mind that most properties in the URL are ignored. Instead, specify all connection properties in the Connection properties section of this dialog box.
      - Driver Class Name: The driver class name of the connection. For the value to use for your data source, see the JDBC connection details of Collibra-provided drivers.
      - Connection properties: This section contains the connection properties.
    - Click Save & Continue.
  - Edit an existing driver.
    - Click the edit icon in the row of an existing driver.
    - Enter the required information. The fields are the same as for adding a new driver.
    - Click Save & Continue.
  - Delete an existing driver.
    - Click the delete icon in the row of an existing driver.
    - Click Delete to confirm the deletion.
- Configure the data source:
  - Schema Name: This name is used in Collibra as the name of the schema asset and must therefore be unique.
  - Schema Description: The description of the schema. This is used as the description of the schema asset.
  - Owner: The owner of the registered data in Collibra.
  - Process On: The Jobserver used for ingesting.
  - <Connection properties section>: This section contains the connection properties.
  - Login Information: This section contains the login information.
    - Store Credentials: Select this option to store the credentials to access the database. With a schema refresh, you can clear this option again.
    - Username: Username to access the database.
      Note: This field is ignored if your data source uses any other means of authentication, such as CyberArk, Kerberos, NTLM or any certificate-based authentication method.
    - Password: Corresponding password to access the database.
      Note: This field is ignored if your data source uses any other means of authentication, such as CyberArk, Kerberos, NTLM or any certificate-based authentication method.
  - Schedule Data Refresh: Enable or disable a schedule to automatically refresh the data registration.
    - Cron Expression: Schedule of the data refresh as a Cron pattern.
      If you create an invalid Cron pattern, Collibra Data Intelligence Platform stops responding.
    - Time Zone: The time zone of the database.
  - Store Data Profile: Option to perform data profiling on the registered data.
  - Detect advanced data types: Option to detect advanced data types in the data source.
  - Store Sample Data: Option to extract sample data from the registered data.
  - Tables excluded from registration: Database tables that will not be ingested.
    Note:
    - If required, you can exclude multiple tables. To do this, press Enter after typing a value and then type the next.
    - You can use an asterisk (*) as a wildcard to select multiple tables. For example, to exclude all tables that start with act_, enter act_*.
    - The table names are case sensitive.
    - You can add or remove tables from this list by refreshing the schema.
    - The Table assets that are created after ingestion have an attribute type called Table Type that defines the type of table that is declared in the data source. For example, TABLE, VIEW,...
- Click Save & Create.
- On the main toolbar, go to Catalog.
  The Catalog Home opens.
- On the main toolbar, open the global create menu.
  The Create dialog box appears.
- In the Create dialog box, click Register data source (use a Collibra provided driver).
- In the Register data source dialog box, enter the required information.
  - Schema Name: This name is used in Collibra as the name of the schema asset and must therefore be unique.
  - Schema Description: The description of the schema. This is used as the description of the schema asset.
  - Owner: The owner of the registered data in Collibra.
- Click Next.
- If required, add and configure the driver of your preference:
  - In the JDBC driver version field, click manage drivers....
    Note: By default, you see the name of the driver that was used last.
  - Do one of the following:
    - Click Add JDBC Driver if you want to create a new JDBC driver.
    - Click the edit icon if you want to edit an existing JDBC driver.
  - Enter the required information.
    - JDBC Driver Version Name: The name of the JDBC driver.
      Tip: As a best practice, use a strict naming convention that includes the data source and a version number. For example: Google BigQuery 1.5 or MySQL 5.9.
    - Upload: Button to upload the relevant files for the data source.
    - Driver files: This table contains a list of uploaded files. You can remove a driver file from this table.
  - Click Next.
  - Configure the JDBC connection.
    - Connection: The JDBC connection string. For the value to use for your data source, see the JDBC connection details of Collibra-provided drivers.
      Warning: Some connection properties can be added to the URL as name-value pairs separated by semicolons. We recommend that you do not use this mechanism unless our documentation specifies otherwise. Keep in mind that most properties in the URL are ignored. Instead, specify all connection properties in the Connection properties section of this dialog box.
    - Driver Class Name: The driver class name of the connection. For the value to use for your data source, see the JDBC connection details of Collibra-provided drivers.
    - Connection properties: This section contains the connection properties.
  - Click Create.
- Enter the database connection properties.
  - JDBC driver version: The JDBC driver to connect to your database.
  - Connect via: The Jobserver used for ingesting.
  - <Connection properties section>: This section contains the connection properties.
  - Login Information: This section contains the login information.
    - Store Credentials: Select this option to store the credentials to access the database. With a schema refresh, you can clear this option again.
    - Username: Username to access the database.
      Note: This field is ignored if your data source uses any other means of authentication, such as CyberArk, Kerberos, NTLM or any certificate-based authentication method.
    - Password: Corresponding password to access the database.
      Note: This field is ignored if your data source uses any other means of authentication, such as CyberArk, Kerberos, NTLM or any certificate-based authentication method.
  - Schedule Data Refresh: Enable or disable a schedule to automatically refresh the data registration.
    - Cron Expression: Schedule of the data refresh as a Cron pattern.
      If you create an invalid Cron pattern, Collibra Data Intelligence Platform stops responding.
    - Time Zone: The time zone of the database.
- Click Next.
- Select the data profiling options.
  - Store Data Profile: Option to perform data profiling on the registered data.
  - Detect advanced data types: Option to detect advanced data types in the data source.
  - Store Sample Data: Option to extract sample data from the registered data.
  - Tables excluded from registration: Database tables that will not be ingested.
    Note:
    - If required, you can exclude multiple tables. To do this, press Enter after typing a value and then type the next.
    - You can use an asterisk (*) as a wildcard to select multiple tables. For example, to exclude all tables that start with act_, enter act_*.
    - The table names are case sensitive.
    - You can add or remove tables from this list by refreshing the schema.
    - The Table assets that are created after ingestion have an attribute type called Table Type that defines the type of table that is declared in the data source. For example, TABLE, VIEW,...
- Click Create.
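The warning about connection properties in the URL can be made concrete with a small sketch. If you have an existing connection string that carries name-value pairs separated by semicolons, the hypothetical helper below splits it into the bare URL and a dictionary of properties, so that you can enter the properties in the Connection properties section instead. The function name, the example URL and the property names are assumptions made for illustration; they are not part of Collibra or of any specific JDBC driver.

```python
from __future__ import annotations


def split_url_properties(url: str) -> tuple[str, dict[str, str]]:
    """Split 'base;name=value;...' into the base URL and its properties.

    Hypothetical helper: it only handles semicolon-separated name=value
    pairs, as described in the warning about the connection string.
    """
    base, *pairs = url.split(";")
    properties: dict[str, str] = {}
    for pair in pairs:
        if not pair:
            continue  # tolerate a trailing semicolon
        name, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"not a name=value pair: {pair!r}")
        properties[name.strip()] = value.strip()
    return base, properties


# Made-up URL for illustration only; not taken from any real driver.
base_url, props = split_url_properties(
    "jdbc:exampledb://db.example.com:1234;Database=sales;Timeout=30"
)
print(base_url)  # the part to enter as the connection string
print(props)     # the pairs to enter as connection properties
```

Whether a given property is accepted in the Connection properties section depends on the driver, so treat the output as a starting point and check it against the connection details of your driver.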
What's next?
The data source is registered and the data is automatically ingested. The ingestion of data is executed in a job. You can see this job in the list of activities.
- If the database contains foreign keys, they are registered as new assets of the Foreign Key asset type. Assets of this type contain the complex relation, which is the link between all column assets that are part of the foreign key definition.
  However, the complex relation is not created if a column is part of a table that is added to the list of Tables excluded from registration.
- If you exclude a table during the schema refresh, the corresponding table, column assets and foreign key mapping are deleted.
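To preview which tables an exclusion list would leave out, the wildcard rule described for Tables excluded from registration can be sketched as follows. This is a minimal illustration of the documented behavior (an asterisk matches any run of characters, and names are case sensitive), not Collibra's actual implementation. The function names and the sample table names are assumptions.

```python
import re


def matches(pattern: str, table_name: str) -> bool:
    """Return True if table_name matches pattern, where '*' is the only wildcard."""
    # Escape every literal part and join the parts with '.*' for each asterisk.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    # fullmatch is case sensitive by default, like the table names.
    return re.fullmatch(regex, table_name) is not None


def tables_to_ingest(tables: list[str], excluded: list[str]) -> list[str]:
    """Keep only the tables that match none of the exclusion patterns."""
    return [t for t in tables if not any(matches(p, t) for p in excluded)]


# Made-up table names for illustration.
tables = ["act_log", "act_users", "ACT_LOG", "orders"]
print(tables_to_ingest(tables, ["act_*"]))
```

In this sketch, ACT_LOG is kept because the match is case sensitive, which mirrors the note that table names are case sensitive.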