Steps overview: Integrate Amazon Redshift via Edge

The integration steps vary slightly depending on how you choose to connect to your data source.

Integrate via JDBC connection

# Step Description
1 Review the preflight checks.

Key considerations for a successful integration, including the required Edge, technical lineage, and data source-specific permissions, network requirements, and more.

2 Create a JDBC connection to Redshift.

For Collibra Data Lineage to connect to and retrieve metadata from your data source, create a JDBC connection. If you already set up a JDBC connection when you registered the data source via Edge, you can reuse that connection.

3 Add the Technical Lineage for Amazon Redshift capability for JDBC connections.

Add the technical lineage capability to your Edge or Collibra Cloud site. The capability allows the lineage harvester to retrieve data from your data source.

4 Synchronize the technical lineage.

You can synchronize your technical lineage manually or automatically by adding a synchronization schedule.
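As an illustration of the connection string that a Redshift JDBC connection in step 2 typically relies on, the following minimal sketch assembles a URL in the standard Amazon Redshift JDBC form, jdbc:redshift://host:port/database. The function name and the example host are hypothetical. Which connection properties Edge expects beyond the URL is not covered here, so treat this as a sketch under those assumptions rather than the required configuration.

```python
DEFAULT_REDSHIFT_PORT = 5439  # Amazon Redshift's default port


def build_redshift_jdbc_url(host: str, database: str,
                            port: int = DEFAULT_REDSHIFT_PORT) -> str:
    """Return a JDBC URL of the form jdbc:redshift://host:port/database.

    Hypothetical helper for illustration; not part of any Collibra tooling.
    """
    if not host or not database:
        raise ValueError("host and database are both required")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"jdbc:redshift://{host}:{port}/{database}"


if __name__ == "__main__":
    # Hypothetical endpoint and database name.
    print(build_redshift_jdbc_url("example-cluster.example.com", "dev"))
    # prints "jdbc:redshift://example-cluster.example.com:5439/dev"
```

Building the URL from its parts makes it easier to catch a missing database name or a non-default port before you enter the value in the connection form.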

Integrate via Shared Storage connection

# Step Description
1 Review the preflight checks.

Key considerations for a successful integration, including the required Edge, technical lineage, and data source-specific permissions, network requirements, and more.
2 Prepare and store your SQL files in a directory.

You need to provide SQL files that include your SQL queries. Collibra Data Lineage processes the metadata based on your queries to create the technical lineage.
3 Create a Shared Storage connection.

For best technical lineage results, use the JDBC connection to ingest JDBC sources when possible, instead of using the Shared Storage connection with SQL files.

Shared Storage connection is not supported for Collibra Cloud sites.

4 Add the Redshift technical lineage capability for Shared Storage connections.

Add the technical lineage capability to your Edge or Collibra Cloud site. The capability allows the lineage harvester to retrieve data from your data source.

5 Synchronize the technical lineage.

You can synchronize your technical lineage manually or automatically by adding a synchronization schedule.
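Before you point a Shared Storage connection at the directory from step 2, it can help to confirm that the directory actually contains usable SQL files. The following is a minimal sketch of such a check. The function name is hypothetical, and searching subfolders and treating whitespace-only files as unusable are assumptions of this sketch, not documented Collibra requirements.

```python
from pathlib import Path


def find_sql_files(directory):
    """Return (usable, empty) lists of .sql files found under directory.

    Hypothetical pre-check for illustration. It searches subfolders too,
    which is an assumption of this sketch.
    """
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"not a directory: {root}")
    usable, empty = [], []
    for path in sorted(root.rglob("*.sql")):
        if not path.is_file():
            continue
        # A file that holds only whitespace contains no query to analyze.
        text = path.read_text(encoding="utf-8", errors="replace")
        if text.strip():
            usable.append(path)
        else:
            empty.append(path)
    return usable, empty
```

Calling find_sql_files on your directory returns the files that contain at least some text and, separately, the ones that are empty, so you can fix or remove the latter before synchronizing.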

Integrate via Cloud Storage connection

# Step Description
1 Review the preflight checks.

Key considerations for a successful integration, including the required Edge, technical lineage, and data source-specific permissions, network requirements, and more.
2 Prepare and store your SQL files in your cloud-based storage system.

You need to provide SQL files that include your SQL queries. Collibra Data Lineage processes the metadata based on your queries to create the technical lineage.

Your SQL files must be stored in one of the following:

  • An AWS S3 bucket.
  • An Azure Data Lake Storage container.
  • A Google Cloud Storage bucket.

3 Create a Cloud Storage connection.

For guidance on how to create a connection between your cloud-based storage system and Edge or Collibra Cloud site, go to the appropriate topic:

  • Amazon Redshift: Create an AWS connection to an Edge or Collibra Cloud site
  • Amazon Redshift: Create an Azure Data Lake Storage connection to an Edge or Collibra Cloud site
  • Amazon Redshift: Create a Google Cloud Platform connection to an Edge or Collibra Cloud site

4 Add the Redshift technical lineage capability for Cloud Storage connections.

Add the technical lineage capability to your Edge or Collibra Cloud site. The capability allows the lineage harvester to retrieve data from your data source.

5 Synchronize the technical lineage.

You can synchronize your technical lineage manually or automatically by adding a synchronization schedule.
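Step 2 above limits SQL file storage to three cloud systems. As a rough illustration, the sketch below maps a storage location written as a URI to one of those systems, using the commonly used s3, abfs/abfss, and gs URI schemes. The function name is hypothetical, and whether Collibra accepts locations in this URI form is an assumption; the connection topics listed in step 3 describe the actual fields.

```python
from urllib.parse import urlparse

# Common URI schemes for the three storage systems named in step 2.
SUPPORTED_SCHEMES = {
    "s3": "AWS S3 bucket",
    "abfs": "Azure Data Lake Storage container",
    "abfss": "Azure Data Lake Storage container",
    "gs": "Google Cloud Storage bucket",
}


def classify_storage_uri(uri: str) -> str:
    """Return the storage system a URI points to, or raise ValueError.

    Hypothetical helper for illustration only.
    """
    parsed = urlparse(uri)
    scheme = parsed.scheme.lower()
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported storage location: {uri!r}")
    if not parsed.netloc:
        raise ValueError(f"missing bucket or container name: {uri!r}")
    return SUPPORTED_SCHEMES[scheme]
```

For example, classify_storage_uri("s3://my-bucket/sql/") returns "AWS S3 bucket", while a local path or a file:// location raises ValueError.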

What's next

After you synchronize the technical lineage, you can view the ingestion report. The report shows how the technical lineage synchronization affected the assets in Collibra.

Helpful resources