Supported data sources for technical lineage

Warning The CLI lineage harvester is now deprecated and will officially reach its end-of-life on July 31, 2026. To ensure a smooth transition, we encourage you to begin creating technical lineage via Edge, if you haven't already.

Collibra Platform supports many data sources and metadata sources, including JDBC data sources, ETL tools and BI tools, for which you can create a technical lineage.

Note Using an older version of a data source might not work as expected; however, we don't expect problems if you use a newer version.

JDBC data sources

The following tables show the supported JDBC data sources.

The following table lists the supported JDBC data sources and connection types you can use when you add capabilities for different data sources. The Shared Storage connection is equivalent to the folder connection type when you use the lineage harvester.

Important Column-level lineage is not generated for tables that are created by SQL statements, unless you provide the SQL statements by creating a shared storage connection.

JDBC data source type

Supported versions

Connection type

Scope

Steps to create technical lineage
Amazon Redshift

1.2.34.1058 and newer

  • JDBC connection
  • Shared Storage connection
  • Cloud Storage connection

SQL based input and stored procedures.

Go to Integration steps overview.
Azure SQL Server

Newest version

  • JDBC connection
  • Shared Storage connection
  • Cloud Storage connection

SQL based input and stored procedures.

Note Technical lineage cannot be created for views and procedures if the SQL definitions are encrypted in the database.
Go to Integration steps overview.
Azure Synapse Analytics

Newest version

  • JDBC connection
  • Shared Storage connection
  • Cloud Storage connection

  • SQL based input and stored procedures.
  • Dedicated SQL pool and serverless SQL pool.

Note Technical lineage cannot be created for views and procedures if the SQL definitions are encrypted in the database.
Go to Integration steps overview.
Databricks Unity Catalog
Newest version

Databricks connection

Lineage information from the lineage system tables.

For more information about supported transformations for Databricks Unity Catalog, go to Supported transformation details.

Go to Integration steps overview.
Google BigQuery

Newest version

  • JDBC connection
  • Shared Storage connection
  • Cloud Storage connection

  • SQL based input.
  • Views.
  • Complex structures (tables or columns) are not supported.

Go to Integration steps overview.
Google Dataplex

Newest version

Google Cloud Platform (GCP) connection

Collibra Data Lineage retrieves the lineage information from Dataplex via the Dataplex Data Lineage API, to generate technical lineage.

Go to Integration steps overview.
Greenplum

6.10 and newer

  • JDBC connection
  • Shared Storage connection
  • Cloud Storage connection

SQL based input.

Go to Integration steps overview.
HiveQL (SQL-like statements)

2.3.5 and newer

  • JDBC connection
  • Shared Storage connection
  • Cloud Storage connection

SQL based input and connection via an AWS host.

Go to Integration steps overview.
IBM Db2

11.5 and newer

Important Support is only for Db2 for LUW.

  • JDBC connection
  • Shared Storage connection
  • Cloud Storage connection

SQL based input without stored procedures.

Go to Integration steps overview.
Oracle

11g, 12c and newer

  • JDBC connection
  • Shared Storage connection
  • Cloud Storage connection

SQL based input and stored procedures.

Go to Integration steps overview.
PostgreSQL

9.4, 9.5 and newer

  • JDBC connection
  • Shared Storage connection
  • Cloud Storage connection

SQL based input and stored procedures.

Go to Integration steps overview.
Microsoft SQL Server

2014, 2016 and newer

  • JDBC connection
  • Shared Storage connection
  • Cloud Storage connection

SQL based input and stored procedures.

Note Technical lineage cannot be created for views and procedures if the SQL definitions are encrypted in the database.
Go to Integration steps overview.
MySQL

5.7, 8 and newer

  • JDBC connection
  • Shared Storage connection
  • Cloud Storage connection

SQL based input without stored procedures.

Go to Integration steps overview.
Netezza

7.2.1.0 and newer

  • JDBC connection
  • Shared Storage connection
  • Cloud Storage connection

SQL based input without stored procedures.

Go to Integration steps overview.
SAP HANA Classic on-premises and SAP HANA Cloud/Advanced
  • 2.00.40 and newer for SAP HANA Classic on-premises
  • 4/2023 and newer for SAP HANA Cloud/Advanced

  • JDBC connection
  • Shared Storage connection
  • Cloud Storage connection

  • SQL-based input.
  • SAP HANA Information views, which include attributes, analytic views and calculation views from database table or view data sources.
  • Script-based calculation views and stored procedures are out of scope.

Go to Integration steps overview.
Snowflake

Newest version

  • JDBC connection
  • Shared Storage connection
  • Cloud Storage connection

  • SQL based input and stored procedures.
  • SQL-API based input and stored procedures.
  • Ingested Snowflake queries that use a function as a source are analyzed and included in the technical lineage.

For more information, go to Snowflake ingestion methods.

Go to Integration steps overview.
Spark SQL

2.4.3 and newer

  • JDBC connection
  • Shared Storage connection
  • Cloud Storage connection

SQL-based input without stored procedures and connection via an AWS host.

For the Spark SQL data source, we recommend using the folder connection type to connect to the directory with your SQL queries.

Go to Integration steps overview.
Sybase Adaptive Server Enterprise

16.0 SP02 and newer

  • JDBC connection
  • Shared Storage connection
  • Cloud Storage connection

SQL based input without stored procedures.

Go to Integration steps overview.
Teradata

15.0, 16.20.07.01 and newer

  • JDBC connection
  • Shared Storage connection
  • Cloud Storage connection

SQL based input, including BTEQ scripts.

Go to Integration steps overview.
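
The notes above for Azure SQL Server, Azure Synapse Analytics and Microsoft SQL Server state that technical lineage cannot be created for views and procedures whose SQL definitions are encrypted. The following Python sketch shows one way you might look for such objects before you start. It is an illustration, not part of Collibra Data Lineage: it assumes Microsoft SQL Server, where the definition column of sys.sql_modules is NULL for objects created WITH ENCRYPTION, and the helper name and sample rows are hypothetical. Running the query against your database is left to the driver of your choice.

```python
from typing import Iterable, List, Optional, Tuple

# T-SQL that lists views, procedures and functions with their definitions.
# Assumption: Microsoft SQL Server, where sys.sql_modules.definition is NULL
# for objects that were created WITH ENCRYPTION.
ENCRYPTION_CHECK_QUERY = """
SELECT OBJECT_SCHEMA_NAME(m.object_id) AS schema_name,
       OBJECT_NAME(m.object_id)        AS object_name,
       m.definition
FROM sys.sql_modules AS m
"""

# One result row: (schema name, object name, definition or None).
Row = Tuple[str, str, Optional[str]]


def find_encrypted_objects(rows: Iterable[Row]) -> List[str]:
    """Return 'schema.object' names whose definition is missing (NULL)."""
    return [
        f"{schema}.{name}"
        for schema, name, definition in rows
        if definition is None
    ]


# Hypothetical rows, shaped like the result of ENCRYPTION_CHECK_QUERY.
sample_rows: List[Row] = [
    ("dbo", "v_orders", "CREATE VIEW dbo.v_orders AS SELECT ..."),
    ("dbo", "p_load_sales", None),
]

print(find_encrypted_objects(sample_rows))
```

Objects reported by a check like this would be the ones for which you should not expect technical lineage.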

The following table shows the supported JDBC data sources and driver versions that have been tested. You can connect to them via a JDBC driver or by using the folder connection method.

Important Column-level lineage is not generated for tables that are created by SQL statements, unless you provide the SQL statements by using the folder connection method. For more information, go to Prepare the lineage harvester configuration file.

JDBC data source type

Supported versions

Connection type

Scope

Steps to create technical lineage
Amazon Redshift

1.2.34.1058 and newer

JDBC, Folder

SQL based input and stored procedures.

Create technical lineage for Amazon Redshift by using the lineage harvester.
Azure SQL Server

Newest version

JDBC, Folder

SQL-based input and stored procedures.

Note Technical lineage cannot be created for views and procedures if the SQL definitions are encrypted in the database.
Create technical lineage for Azure SQL server by using the lineage harvester.
Azure Synapse Analytics

Newest version

JDBC, Folder

SQL-based input and stored procedures.

Note Technical lineage cannot be created for views and procedures if the SQL definitions are encrypted in the database.
Create technical lineage for Azure Synapse Analytics by using the lineage harvester.
Google BigQuery

Newest version

JDBC, Folder

  • SQL based input.
  • Views.
  • Complex structures (tables or columns) are not supported.
Create technical lineage for Google BigQuery by using the lineage harvester.
Greenplum

6.10 and newer

JDBC, Folder

SQL-based input without stored procedures.

Create technical lineage for Greenplum by using the lineage harvester.
HiveQL (SQL-like statements)

2.3.5 and newer

JDBC, Folder

SQL-based input and connection via an AWS host. Stored procedures are not supported.

Create technical lineage for HiveQL by using the lineage harvester.
IBM Db2

11.5 and newer

JDBC, Folder

SQL-based input without stored procedures.

Create technical lineage for IBM Db2 by using the lineage harvester.
Oracle

11g, 12c and newer

JDBC, Folder

SQL-based input and stored procedures.

Create technical lineage for Oracle by using the lineage harvester.
PostgreSQL

9.4, 9.5 and newer

JDBC, Folder

SQL-based input without stored procedures.

Create technical lineage for PostgreSQL by using the lineage harvester.
Microsoft SQL Server

2014, 2016 and newer

JDBC, Folder

SQL-based input and stored procedures.

Note Only Basic Authentication is supported. NTLM authentication, for example, is not.

Note Technical lineage cannot be created for views and procedures if the SQL definitions are encrypted in the database.
Create technical lineage for Microsoft SQL Server by using the lineage harvester.
MySQL

5.7, 8 and newer

JDBC, Folder

SQL-based input without stored procedures.

Create technical lineage for MySQL by using the lineage harvester.
Netezza

7.2.1.0 and newer

JDBC, Folder

SQL-based input without stored procedures.

Create technical lineage for Netezza by using the lineage harvester.
SAP HANA Classic on-premises and SAP HANA Cloud/Advanced
  • 2.00.40 and newer for SAP HANA Classic on-premises
  • 4/2023 and newer for SAP HANA Cloud/Advanced

JDBC, Folder

  • SQL-based input.
  • SAP HANA Information views, which includes attributes, analytic views and calculation views from database table or view data sources.
  • Script-based calculation views and stored procedures are out of scope.
Create technical lineage for SAP HANA by using the lineage harvester.
Snowflake

Newest version

JDBC, Folder

  • SQL-based input with stored procedures.
  • SQL-API-based input with stored procedures.

For more information, go to Snowflake ingestion methods.

Create technical lineage for Snowflake by using the lineage harvester.
Spark SQL

2.4.3 and newer

JDBC, Folder

SQL-based input and connection via an AWS host. Stored procedures are not supported.

For the Spark SQL data source, we recommend using the folder connection type to connect to the directory with your SQL queries.

Create technical lineage for Spark SQL by using the lineage harvester.
Sybase Adaptive Server Enterprise

16.0 SP02 and newer

JDBC, Folder

SQL-based input without stored procedures.

Create technical lineage for Sybase Adaptive Server Enterprise by using the lineage harvester.
Teradata

15.0, 16.20.07.01 and newer

JDBC, Folder

SQL-based input and stored procedures, including BTEQ scripts.

Create technical lineage for Teradata by using the lineage harvester.
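
The Supported versions column above gives a minimum version for several data sources. As a rough illustration of how to compare your own version against those minimums, here is a minimal Python sketch. It only covers entries with purely numeric, dot-separated versions (values such as "11g" or "16.0 SP02" need their own handling), and the function names are hypothetical. It is not a Collibra utility.

```python
from typing import Dict, Tuple

# Minimum versions copied from the tables above.
# Only purely numeric, dot-separated entries are included in this sketch.
MINIMUM_VERSIONS: Dict[str, str] = {
    "Amazon Redshift": "1.2.34.1058",
    "Greenplum": "6.10",
    "HiveQL": "2.3.5",
    "IBM Db2": "11.5",
    "Netezza": "7.2.1.0",
    "Spark SQL": "2.4.3",
}


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '6.10' into (6, 10) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(source: str, version: str) -> bool:
    """Return True if the version is at or above the listed minimum."""
    return parse_version(version) >= parse_version(MINIMUM_VERSIONS[source])


print(meets_minimum("Greenplum", "6.9"))    # below the listed 6.10
print(meets_minimum("Spark SQL", "3.0.0"))  # above the listed 2.4.3
```

Comparing integer tuples rather than strings avoids treating "6.9" as newer than "6.10".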

Authentication for JDBC data sources

Collibra Data Lineage supports the following means of authentication:

  • For all data source types, except for external directories: username and password.
  • Google BigQuery: username and password or a service account key file.
  • Snowflake: username and password or key pair authentication.
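
The list above can be read as a default method plus per-source alternatives. The following Python sketch encodes it as a small lookup, for example to validate a configuration before you use it. The method labels and the function name are hypothetical and are not Collibra configuration keys. External directories are outside the scope of this sketch.

```python
from typing import Dict, Set

# Default method for all JDBC data source types (external directories excluded).
DEFAULT_METHODS: Set[str] = {"username_password"}

# Additional methods listed above for specific data sources.
# The labels are illustrative only.
EXTRA_METHODS: Dict[str, Set[str]] = {
    "Google BigQuery": {"service_account_key_file"},
    "Snowflake": {"key_pair"},
}


def allowed_auth_methods(source: str) -> Set[str]:
    """Return the authentication methods listed for a JDBC data source."""
    return DEFAULT_METHODS | EXTRA_METHODS.get(source, set())


print(sorted(allowed_auth_methods("Snowflake")))
print(sorted(allowed_auth_methods("Oracle")))
```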

ETL tools

The following table shows the supported ETL tools.

The following table lists the supported ETL data sources and connection types you can use when you add capabilities for different data sources. The Shared Storage connection is equivalent to the folder connection type when you use the lineage harvester.

Note Indirect lineage, as described in the topics The technical lineage graph and Technical lineage Settings tab pane, is only available when working with JDBC data sources and the Lookup transformation for IBM DataStage.

ETL tool

Supported versions

Connection type

Scope

Steps to create technical lineage
Apache Airflow
2.7 or newer
  • Shared Storage connection
  • Cloud Storage connection

Commonly supported transformations and activities in Airflow.

Collibra Data Lineage creates technical lineage for Airflow by leveraging the OpenLineage Airflow integration.

To create this technical lineage, we recommend using Fluentd. Fluentd is a third-party, open-source tool maintained by the community. While we provide documentation to help configure the integration, Collibra support is limited to Collibra-side configuration and does not cover troubleshooting the Fluentd environment.

Go to Integration steps overview.
AWS Glue

N/A

  • Shared Storage connection
  • Cloud Storage connection

Commonly supported transformations and activities in AWS Glue. For more information, go to Supported transformation details for AWS Glue via OpenLineage.

Collibra Data Lineage creates technical lineage for AWS Glue via the OpenLineage Spark integration.

To create this technical lineage, we recommend using Fluentd. Fluentd is a third-party, open-source tool maintained by the community. While we provide documentation to help configure the integration, Collibra support is limited to Collibra-side configuration and does not cover troubleshooting the Fluentd environment.

Go to Integration steps overview.
Azure Data Factory
2 and newer

API

Commonly supported transformations and activities in Azure Data Factory, including parameterized pipelines. For details, go to Supported transformation details.

Go to Integration steps overview.
dbt
1.4 or newer
  • dbt Cloud: API
  • dbt Core:
    • Shared Storage connection
    • Cloud Storage connection

Commonly supported model types in dbt. For details, go to:

For dbt Core, you have to prepare the data source files with all data objects that you want to process.

Go to Integration steps overview for dbt Cloud.

Go to Integration steps overview for dbt Core.

IBM InfoSphere DataStage

11.5 and newer

  • Shared Storage connection
  • Cloud Storage connection

Commonly used DataStage ETL components including SQL overrides and transformation details.

You have to prepare the data source files with all data objects that you want to process.

Go to Integration steps overview.
Informatica Intelligent Cloud Services, specifically Cloud Data Integration

Tip Data Integration is one of the Informatica Intelligent Cloud Services.

Cloud, newest only

Informatica Intelligent Cloud Services (IICS) connection

Note Collibra Platform 2023.03 or newer is required to use the Informatica Intelligent Cloud Services (IICS) connection.

Commonly used transformations in Informatica Intelligent Cloud Services: Data Integration, including SQL overrides.

Supported data sources are locally stored flat files and databases.

Go to Integration steps overview.
Informatica PowerCenter

9.6 and newer

  • Shared Storage connection
  • Cloud Storage connection

Commonly used transformations in Informatica PowerCenter, including SQL overrides.

You have to prepare the data source files with all data objects that you want to process.

Go to Integration steps overview.
Matillion

Newest version

Matillion connection

Note Collibra Platform 2023.03 or newer is required to use the Matillion connection.

SQL based input without stored procedures.

Technical lineage via Edge can only access Redshift and Snowflake projects.

Go to Integration steps overview.
OpenLineage
N/A
  • Shared Storage connection
  • Cloud Storage connection

Commonly supported transformations and activities. For more information, go to Supported transformation details for OpenLineage.

To create this technical lineage, we recommend using Fluentd. Fluentd is a third-party, open-source tool maintained by the community. While we provide documentation to help configure the integration, Collibra support is limited to Collibra-side configuration and does not cover troubleshooting the Fluentd environment.

Go to Integration steps overview.
SQL Server Integration Services (SSIS)

2012 and newer

Package format version 6 or newer.

  • Shared Storage connection
  • Cloud Storage connection

All commonly used transformations in SSIS, data flows and mappings, including SQL overrides.

Important SQL statements from Excel are not supported.

You have to prepare the data source files with all data objects that you want to process.

Go to Integration steps overview.
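
The Apache Airflow, AWS Glue and OpenLineage rows above all rely on OpenLineage events. For orientation, the following Python sketch builds a minimal event in the general shape defined by the OpenLineage specification (a run, a job, and input and output datasets). All names and values are hypothetical, the producer URL is a placeholder, and the sketch does not show how events reach the Shared Storage or Cloud Storage connection, nor which facets Collibra Data Lineage uses.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, List, Tuple

# A dataset is identified by a (namespace, name) pair in OpenLineage.
Dataset = Tuple[str, str]


def build_run_event(
    job_namespace: str,
    job_name: str,
    inputs: List[Dataset],
    outputs: List[Dataset],
) -> Dict[str, Any]:
    """Build a minimal OpenLineage-style run event as a plain dictionary."""
    return {
        "eventType": "COMPLETE",
        "eventTime": datetime.now(timezone.utc).isoformat(),
        # Placeholder producer URL, for illustration only.
        "producer": "https://example.com/hypothetical-producer",
        "schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent",
        "run": {"runId": str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [{"namespace": ns, "name": name} for ns, name in inputs],
        "outputs": [{"namespace": ns, "name": name} for ns, name in outputs],
    }


# Hypothetical job that reads one table and writes another.
event = build_run_event(
    "warehouse",
    "daily_load",
    inputs=[("postgres://db", "public.orders")],
    outputs=[("postgres://db", "public.orders_summary")],
)
print(json.dumps(event, indent=2))
```

In practice, the OpenLineage integration for Airflow or Spark emits such events for you. The sketch is only meant to show what kind of information they carry.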

The following table shows the supported ETL tools and driver versions that have been tested. You can connect to them via an API or by creating a folder.

Note Indirect lineage, as described in the topics The technical lineage graph and Technical lineage Settings tab pane, is only available when working with JDBC data sources and the Lookup transformation for IBM DataStage.

ETL tool

Supported versions

Connection type

Scope

Steps to create technical lineage
Azure Data Factory
2 and newer

API

Commonly supported transformations and activities in Azure Data Factory. For details, go to Supported transformation details.

Create technical lineage for Azure Data Factory by using the lineage harvester.
dbt
1.4 or newer

API for dbt Cloud

Folder for dbt Core

Commonly supported model types in dbt. For details, go to:

For dbt Core, you have to prepare a folder with all data objects that you want to process.

Create technical lineage for dbt Cloud by using the lineage harvester.

Create technical lineage for dbt Core by using the lineage harvester.

IBM InfoSphere DataStage

11.5 and newer

Folder

Commonly used DataStage ETL components including SQL overrides and transformation details.

Collibra Data Lineage supports IBM InfoSphere DataStage transformation logic.

You have to prepare a folder with all data objects that you want to process.

Create technical lineage for DataStage by using the lineage harvester.
Informatica Intelligent Cloud Services, specifically Cloud Data Integration

Tip Data Integration is one of the Informatica Intelligent Cloud Services.

Cloud, newest only

API

Commonly used transformations in Informatica Intelligent Cloud Services: Data Integration, including SQL overrides.

Supported data sources are locally stored flat files and databases.

Create technical lineage for Informatica Intelligent Cloud Services by using the lineage harvester.
Informatica PowerCenter

9.6 and newer

Folder

Commonly used transformations in Informatica PowerCenter, including SQL overrides.

You have to prepare a folder with all data objects that you want to process.

Create technical lineage for Informatica PowerCenter by using the lineage harvester.
Matillion

Newest version

API

SQL based input without stored procedures.

The lineage harvester can only access Redshift and Snowflake projects.

Create technical lineage for Matillion by using the lineage harvester.
SQL Server Integration Services (SSIS)

2012 and newer

Package format version 6 or newer.

Folder

All commonly used transformations in SSIS, data flows and mappings, including SQL overrides.

Important SQL statements from Excel are not supported.

You have to prepare a folder with all data objects that you want to process.

Create technical lineage for SQL Server Integration Services by using the lineage harvester.

BI tools

The following table shows the supported BI tools.

The following table lists the supported BI data sources and connection types you can use when you add capabilities for different data sources.

Note Indirect lineage, as described in the topics The technical lineage graph and Technical lineage Settings tab pane, is only available when working with JDBC data sources and the Lookup transformation for IBM DataStage.

BI tool

Tested versions

Connection type

Capability

Steps to create technical lineage
Looker
Newest

API

Technical Lineage for Looker

Go to Integration steps overview.
MicroStrategy

Newest

Note Freeform SQL is supported for reports (not cubes or dossiers) if you have MicroStrategy update10 or newer, or MicroStrategy ONE.

API

Technical Lineage for MicroStrategy

Go to Integration steps overview.
Power BI

Newest

Note DirectQuery and live connections are not supported.

API

Technical Lineage for Power BI

Go to Integration steps overview.

Qlik
Newest

API

Technical Lineage for Qlik

Go to Integration steps overview.

Note Metadata and asset details are ingested, but technical lineage is not yet available.

SSRS-PBRS
  • SSRS: 2017 and newer
    Note Due to a bug in 2017 that is resolved by the newer APIs, we recommend using SQL Server 2019 or newer Reporting Services.
  • PBRS: 2019 and newer

API

Technical Lineage for SSRS-PBRS

Go to Integration steps overview.
Tableau

Tableau Prep Builder is not supported.

Newest

Live Connection and Extract are both supported.

API

Technical Lineage for Tableau

Go to Integration steps overview.

The following table shows the supported BI tools.

Note Indirect lineage, as described in the topics The technical lineage graph and Technical lineage Settings tab pane, is only available when working with JDBC data sources and the Lookup transformation for IBM DataStage.

BI tool

Tested versions

Connection type

Steps to create technical lineage
Looker

Newest

API

Collibra Data Lineage automatically creates a technical lineage, but stitching is not available.

You have to prepare a lineage harvester configuration file for Looker ingestion.

Create technical lineage for Looker via the CLI lineage harvester (deprecated).
MicroStrategy

Newest

Note Freeform SQL is supported for reports (not cubes or dossiers) if you have MicroStrategy update10 or newer, or MicroStrategy ONE.

You have to prepare a lineage harvester configuration file for MicroStrategy ingestion.

Benefits of the new integration method include:
  • Support for the latest MicroStrategy APIs.
  • Support for technical lineage and stitching.
  • New operating model.
  • No longer dependent on a direct connection to the repository.
Create technical lineage for MicroStrategy via the lineage harvester (deprecated)
Power BI

Newest

Note DirectQuery and live connections are not supported.

API

You have to prepare:

Collibra Data Lineage supports:

  • Power BI on the Microsoft Power Platform.
  • Power BI on Fabric.
The configuration requirements and the integration are the same, regardless of your setup.

Create technical lineage for Power BI via the lineage harvester (deprecated)
SSRS-PBRS

  • SSRS: 2017 and newer
    Note Due to a bug in 2017 that is resolved by the newer APIs, we recommend using SQL Server 2019 or newer Reporting Services.
  • PBRS: 2019 and newer

API

You have to prepare:

Important There are known limitations to the metadata returned by the API when integrating PBRS. For example, Power BI reports in PBRS are ingested as Power BI Report assets in Data Catalog, but there is no technical lineage for the reports.

Create technical lineage for SSRS or PBRS via the lineage harvester (deprecated)
Tableau

Tableau Prep Builder is not supported.

Newest

Live Connection and Extract are both supported.

API

You have to prepare:

Create technical lineage for Tableau via the lineage harvester (deprecated)

Custom technical lineage

You can create a custom technical lineage to include data objects from data sources that are not listed above.

For information on creating a custom technical lineage via Edge, go to About custom technical lineage.