About preparing an Edge or Collibra Cloud for data sources

After you create an Edge or request a Collibra Cloud, you can start creating connections to your data sources. You can then add capabilities that use these connections to get information from the data sources to Collibra.

Typically, you create a connection for a data source and add capabilities for this connection. The connection must be set up with the correct information, and the capabilities you add must match the data source and your needs. Each connection and capability may have slightly different steps or requirements, so be sure to review the data source-specific information.

For example, you set up a PostgreSQL data source connection. This is a JDBC connection. You want to integrate the metadata in Collibra, profile the data, and get samples. To do this, you need to add the Catalog JDBC ingestion capability, JDBC Profiling capability, and Catalog JDBC Sampling capability to the connection.
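The relationship in this example can be pictured as a small data model. The following Python sketch is purely illustrative: the `Connection` and `Capability` classes, their fields, and the connection name are hypothetical and are not part of any Collibra API. Only the three capability names come from the example.

```python
from dataclasses import dataclass, field


@dataclass
class Capability:
    """Hypothetical stand-in for a capability added to a connection."""
    name: str


@dataclass
class Connection:
    """Hypothetical stand-in for a connection on an Edge or Collibra Cloud."""
    name: str
    kind: str  # for example "JDBC"
    capabilities: list[Capability] = field(default_factory=list)

    def add_capability(self, name: str) -> None:
        # Each capability is attached to exactly one connection in this sketch.
        self.capabilities.append(Capability(name))


# One connection for the PostgreSQL data source (the name is made up).
postgres = Connection(name="postgresql-connection", kind="JDBC")

# One capability per goal: integrate metadata, profile the data, get samples.
for capability_name in (
    "Catalog JDBC ingestion",
    "JDBC Profiling",
    "Catalog JDBC Sampling",
):
    postgres.add_capability(capability_name)

capability_names = [c.name for c in postgres.capabilities]
print(capability_names)
```

The point of the sketch is the shape: one connection, several capabilities that all depend on it.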

JDBC (Java Database Connectivity) integrations allow you to connect directly to your data source from your Edge or Collibra Cloud. When you create a JDBC connection, you will enter your login credentials, which will then be stored for authentication. This means that you don't need to enter these credentials again for any capability that uses this JDBC connection.
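As a rough illustration of why credentials are entered only once, the sketch below keeps them on a single connection object that every capability receives. The `JdbcConnection` class, the `run_capability` function, and the host, database, and user values are assumptions made for this example and do not reflect how Edge or Collibra Cloud stores credentials internally. The URL shape `jdbc:postgresql://host:port/database` is the standard PostgreSQL JDBC format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JdbcConnection:
    """Hypothetical connection record; credentials are entered once, here."""
    host: str
    port: int
    database: str
    username: str
    password: str

    @property
    def url(self) -> str:
        # Standard PostgreSQL JDBC URL shape.
        return f"jdbc:postgresql://{self.host}:{self.port}/{self.database}"


def run_capability(capability_name: str, connection: JdbcConnection) -> str:
    """Hypothetical capability run: it reuses the connection's stored login."""
    return f"{capability_name} -> {connection.url} as {connection.username}"


# Placeholder values for illustration only.
connection = JdbcConnection(
    host="db.example.com",
    port=5432,
    database="sales",
    username="collibra_reader",
    password="not-a-real-password",
)

# Both capabilities use the same connection; neither asks for credentials again.
results = [
    run_capability("Catalog JDBC ingestion", connection),
    run_capability("JDBC Profiling", connection),
]
print(results)
```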

If an integration capability does not connect to a JDBC data source, it has to connect on its own by using the information provided by Edge or Collibra Cloud. The connection information is defined and stored as a Connection instance. The connection properties are shown on the Connections tab of an Edge or Collibra Cloud.

Steps

The Edge and Collibra Cloud integration process consists of the following general steps:

1 Create a connection.

A connection links your Edge or Collibra Cloud with your data source, whether that be a database, file share, or REST service. The subsequent capability jobs that are run through this connection send information back to your Collibra Platform.

For more information, go to our list of available Edge and Collibra Cloud connections.

2 Create a capability.

A capability calls your data source and sends the metadata back to your Collibra Platform. The end results are your assets, such as schemas, tables, and so on.

For more information, go to our list of available Edge and Collibra Cloud capabilities.

What's next?

Once your Edge or Collibra Cloud is prepared, you can use the capabilities. In most cases, you need to first make sure metadata is available in Collibra. The way to do this differs depending on your data source.

  • For JDBC connections:
    • When you create a JDBC connection to a data source, you must first register the data source in your Collibra Platform. This creates a Database asset that you then need to synchronize. The synchronization process ingests metadata from the data source into Collibra. This results in assets with information, such as Schema assets, Table assets, and so on. Collibra does not include the actual data from the data source, only the data about the data. This full flow is called registering a data source. For more information, go to About registering a data source.
  • For non-JDBC connections:
    • When you create any other kind of connection, you only need to synchronize the data source in your Collibra Platform. The synchronization process ingests metadata from the data source into Collibra. This results in assets with information, such as Schema assets, Table assets, and so on, and creates a structure of assets that represents the structure in the data source. For more information about synchronizing non-JDBC integrations, go to the data source-specific documentation.
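The difference between the two flows can be summarized in a few lines. This is a minimal sketch: `onboarding_steps` is a hypothetical helper, and the step names are labels chosen for this example, not Collibra commands or API calls.

```python
def onboarding_steps(connection_kind: str) -> list[str]:
    """Return the steps, in order, that make metadata available in Collibra.

    Hypothetical helper: the step names are labels for this sketch only.
    """
    if connection_kind.upper() == "JDBC":
        # JDBC: registering creates a Database asset, which is then synchronized.
        return ["register data source", "synchronize Database asset"]
    # Any other connection kind: synchronization alone ingests the metadata.
    return ["synchronize data source"]


jdbc_steps = onboarding_steps("JDBC")
other_steps = onboarding_steps("REST")
print(jdbc_steps)
print(other_steps)
```

In both cases only metadata is ingested. The extra registration step for JDBC connections exists because it creates the Database asset that synchronization then works from.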