About Collibra Data Lineage
Collibra Data Lineage is a cloud-only product that maps the entire data lifecycle, allowing you to visualize and audit the flow of data from source systems to downstream target systems. It is designed to help you establish trust in your reports and use the data to make sound business decisions.
Collibra Data Lineage consists of both technical lineage and business lineage. Technical lineage and business lineage offer the same value, but they are designed for different audiences. The main difference is that:
- Technical lineage identifies data objects in your external data sources.
- Business lineage shows assets in Collibra that represent some or all of those data objects.
For a complete list of supported data sources, go to Supported data sources for technical lineage. If you want to create a technical lineage for a data source that is not currently supported, you can create a Custom technical lineage.
Technical lineage
Technical lineage is designed for Data Engineers, Data Architects, and other technical stewards. It is a detailed lineage graph that provides complete end-to-end lineage, to visualize the journey of the data objects, including temporary tables and columns, in your external data sources. It includes all source code and data transformation details, so that you can identify in which system data objects are used and how they are transformed from data source to data source.
Asset types
You can view a technical lineage for the following asset types:
- Table
- Column
- Looker Look
- MicroStrategy Report
- MicroStrategy Dossier
- MicroStrategy Data Attribute
- Power BI Report
- Power BI Table
- Power BI Column
- SSRS Report
- SSRS Table
- SSRS Column
- Tableau Worksheet
- Tableau Data Attribute
After creating a technical lineage, you can view it by clicking the Technical lineage tab on an eligible asset page.
Data objects
There are two types of data objects in your technical lineage:
- Data objects from your data source that are stitched to assets in Data Catalog and for which you created the technical lineage. Successful stitching of data objects to their corresponding assets in Data Catalog is denoted by yellow icons or a yellow background in the technical lineage.
- Other objects, such as temporary tables and columns, that are collected from your data sources but are not stitched to assets in Data Catalog. These objects have gray icons or a gray background. For example, the objects in node 1 in the following technical lineage graph are not stitched, meaning they don't have corresponding assets in Data Catalog.
Example technical lineage
Let's say that you have created a technical lineage for four different databases:
- The first database, Oracle, is not registered in Collibra, so there are no assets in Data Catalog that represent the Oracle data objects.
- The second database, Raw, is registered in Collibra.
  - The yellow background of the first node indicates that Table and Column assets that were created in Data Catalog are stitched to their corresponding data objects in the Raw database.
  - The other node, the one with the gray background, is a temporary table. No assets are created for temporary data objects and so stitching is not relevant. That is why the node has a gray background.
- The third and fourth databases, Refined and Consumption, are ingested in Collibra. The assets that were created in Data Catalog are stitched to their corresponding data objects in the two databases.
The key point is that the technical lineage shows the data flow of all data objects across all four databases, regardless of whether they have assets in Collibra.
The corresponding business lineage shows only the relations between data objects that have corresponding assets in Data Catalog. In the following image, we see the data flow of assets from the second database, to the third, to the fourth. The first database, Oracle, is not registered in Collibra and is therefore not shown in the diagram.
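The relationship between the two views in this example can be sketched in a few lines of Python. This is only an illustrative model, not Collibra's implementation: the object names, the `edges` list, and the `stitched` set are hypothetical, and the sketch assumes that the business lineage bridges over unstitched objects such as temporary tables.

```python
from collections import defaultdict

# Hypothetical technical lineage: (source object, target object) pairs.
edges = [
    ("oracle.orders", "raw.orders"),        # Oracle is not registered
    ("raw.orders", "raw.tmp_orders"),       # temporary table, no asset
    ("raw.tmp_orders", "refined.orders"),
    ("refined.orders", "consumption.sales"),
]

# Objects assumed to be stitched to assets in Data Catalog.
stitched = {"raw.orders", "refined.orders", "consumption.sales"}


def business_edges(edges, stitched):
    """Derive asset-to-asset relations, bridging over unstitched objects."""
    targets = defaultdict(set)
    for src, dst in edges:
        targets[src].add(dst)

    result = set()
    for start in stitched:
        stack = list(targets[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            if node in stitched:
                # Stop at the next stitched object and record the relation.
                result.add((start, node))
            else:
                # Keep walking through unstitched objects.
                stack.extend(targets[node])
    return sorted(result)


print(business_edges(edges, stitched))
```

In this model, the full `edges` list plays the role of the technical lineage, while `business_edges` returns only the relations between stitched objects, so the unregistered Oracle table and the temporary table drop out of the result.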
Business lineage
Business lineage is designed for analysts, governance roles, and other business stewards. A business lineage shows the relations between assets in Data Catalog that represent the data objects in your external data sources. More specifically, it is a diagram that includes relations of the type "Data Element sources / targets Data Element":
- Between Column assets of registered data sources.
- If you integrated one of the supported BI tools, between BI assets and assets of registered data sources.
Business lineage allows you to trace data flows between registered databases. As such, it provides a summary of a technical lineage.
The following image depicts an example business lineage. Notice the "Data Element sources / targets Data Element" relation between columns belonging to 3 different tables.
Tip Be sure to check out the training course From business lineage to insight, in Collibra University.
Automatically created
Business lineage is automatically created as part of the technical lineage process.
During the lineage generation process, the Collibra Data Lineage service instance automatically pushes relations of the type "Data Element sources / targets Data Element" to the Collibra Platform.
BI tool integration
Business intelligence software helps organizations to collect data from the various data sources across their data ecosystem and present the data in interactive dashboards and reports, to facilitate decision-making and strategic planning.
When you integrate your BI tool in Collibra:
- Metadata about the data objects in your external data sources is created as BI assets in Collibra.
- Relations are created:
  - Between data objects (such as columns and tables) in your external data source and their corresponding assets in Data Catalog (such as Column and Table assets). Note: These assets are created when the data source is registered. For supported BI tools, registration is automatically carried out during the technical lineage process.
  - Between BI assets (such as Tableau worksheets and Power BI reports) and their corresponding assets in Data Catalog (such as Tableau Worksheet and Power BI Report assets).
- Both technical lineage and business lineage are automatically created.
Report views
Collibra Data Lineage enables you to find all ingested BI asset types in a single location.
On the Reports tab page in Data Catalog, you can see an overview of all BI Report assets and their children. Optionally, you can create a view with a filter to show only, for example, Tableau assets. This is useful if you want to quickly see all reports or find specific reports, for example certified reports or the most frequently visited reports.
Business value
Collibra Data Lineage has many important use cases. Here are a few.
By providing transparency and traceability to the data used in a report, data lineage plays a foundational role in the report certification process:
- Review data sources and transformations associated with the data in a report, to help ensure accuracy and reliability.
- Identify the original sources of data used in the report, and how the data moves from the source system to intermediate systems.
- View and analyze the calculation rules that are used to extract and transform the data before it reaches the report.
All critical metadata is ingested during BI integration and shown on the Collibra asset pages. This includes information like data timestamps, quality metrics, data ownership, and other valuable attributes that help you to assess the reliability and quality of the data.

You can manually synchronize the data in Collibra or set up a synchronization schedule, to help ensure the accuracy and completeness of the data over time. This can help identify inconsistencies or gaps in the data flow and transformation processes.
Collibra Data Lineage can help you with impact analysis when making changes to data sources, adjusting the calculation rules that drive transformations, migrating data and more. It can help you assess the potential impact of changes on downstream systems, data and reports.
Example Let's say you have data in a Snowflake data source, and you need to move everything to Databricks. After migration, you can create a technical lineage to trace the movement of data from one data source to the other and ensure data integrity throughout the migration process.
Understanding data dependencies and relationships helps you to:
- Anticipate which downstream systems could be impacted if you've made changes to a data source or calculation rule.
- Anticipate how changes to a particular data object or system will propagate across your data landscape.
- Minimize risks and make better informed decisions.
Collibra Data Lineage is a valuable tool for helping data analysts and engineers trace the source of data quality issues and anomalies. When you detect a discrepancy in your data, you can examine the lineage and source code to:
- Trace the issue back to the source system or process that is causing the problem.
- Analyze any calculation rules that might have affected the consistency or quality of the data.
- Identify how the issue is affecting downstream systems and reporting.
This can help you identify potential areas where the root cause might exist.
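As a rough illustration of tracing an issue back toward its source, the following sketch walks a lineage graph upstream from an affected report. It is a simplified model under stated assumptions, not a Collibra API: the `edges` list, the object names, and the `upstream` helper are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges: (source object, target object).
edges = [
    ("crm.customers", "staging.customers"),
    ("erp.invoices", "staging.invoices"),
    ("staging.customers", "mart.revenue"),
    ("staging.invoices", "mart.revenue"),
    ("mart.revenue", "report.quarterly_revenue"),
]


def upstream(edges, start):
    """Return every object that feeds `start`, nearest objects first."""
    sources = defaultdict(list)
    for src, dst in edges:
        sources[dst].append(src)

    order, seen = [], {start}
    queue = deque([start])
    while queue:  # breadth-first walk against the direction of data flow
        node = queue.popleft()
        for src in sources[node]:
            if src not in seen:
                seen.add(src)
                order.append(src)
                queue.append(src)
    return order


for obj in upstream(edges, "report.quarterly_revenue"):
    print(obj)
```

Listing the nearest objects first mirrors how you might inspect a lineage graph in practice: check the closest transformation, then work back toward the source systems.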
Compliance with data privacy regulations such as GDPR and CCPA, and various security, auditing and reporting standards, often requires organizations to show end-to-end traceability across their data landscape. In the data privacy context, Collibra Data Lineage can give you a complete view of where sensitive and restricted data is processed, shared, and stored.
- Trace the information across your systems, data sources, and processes.
- Monitor any migrations and transformations to the data.
- Identify who has access to the systems and data sources that consume the data.
BI tool integration in Collibra enables you to view all of the critical metadata about your reports and dashboards on dedicated asset pages in Data Catalog. The many attributes help you to identify the most critical reports that have the highest impact. This can help you effectively allocate your resources and minimize disruptions.
A few of the key attributes include the following:
- Document creation and modification dates: See when the report was created and updated in your BI tool.
- Visits count: See how many people have viewed the report. Let's say that you have two reports with the same name, but one has 400 views and the other has almost none. That gives a strong indication as to which is the more helpful report.
- Owner in Source: Identify who owns and who certified a report, so that you know where to turn for additional help and information.
- Calculation Rule: See DAX calculations for calculated columns and measures on Power BI Column asset pages.
- URL: Access the report in your BI tool.
- Relation types allow you to immediately identify in which other reports a report is used.
How to create a technical lineage
There are two ways to create a technical lineage and business lineage:
For details about the typical workflow, go to About technical lineage via Edge.
Summary of differences between technical and business lineage
| Business lineage | Technical lineage |
|---|---|
| Allows Business Analysts and other business stewards to view relations between assets in Data Catalog that represent the data objects in external data sources. | Allows Data Engineers, Data Architects and similar personas to view the flow of data objects in external data sources. |
| Accessible via the Diagram tab on all asset pages. | Accessible via the Technical Lineage tab pane of all Column and Table assets, and some BI assets. |
| Shows relations of the type "Data Element targets / sources Data Element" between assets that exist in Data Catalog. Warning: During the ingestion process, relations of the type "Data Element targets / sources Data Element" are automatically created between certain assets. Any relations of this type that you manually create between assets will be deleted during the synchronization process. If you want to manually create such relations and ensure that they are maintained, you can create a custom technical lineage. | Shows relations of the type "Data Element targets / sources Data Element" between all data objects in the external data source. Note: Temporary tables and columns that the lineage scanner collected from your data sources, but that are not assets in Data Catalog, are also included in the technical lineage. |
| Shows how assets in Collibra from registered data sources relate to each other. Supported BI and ETL tools are automatically registered during the process of generating a technical lineage. | Shows how data objects from data sources for which you create a technical lineage relate to each other, regardless of whether the data source is registered. |
Dependencies
A dependency is a data object that is targeted by another data object. This is represented by a relation of the type "Data Element targets / sources Data Element", where the dependency is the tail.
Direct dependency
A direct dependency is a data object that is the tail of a relation of the type "Data Element targets / sources Data Element".
Indirect dependency
An indirect dependency is a data object that is the target of a direct or another indirect dependency.
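The definitions above can be sketched as a small graph traversal. This is a minimal illustration, not Collibra's implementation: the `relations` list, the object names, and the `dependencies` helper are hypothetical, and each pair is assumed to mean "head targets tail".

```python
from collections import defaultdict

# Hypothetical relations: (head, tail) means "head targets tail".
relations = [
    ("raw.orders.amount", "refined.orders.amount"),
    ("refined.orders.amount", "consumption.sales.total"),
    ("consumption.sales.total", "report.revenue.total"),
]


def dependencies(relations, obj):
    """Split the objects reachable from `obj` into direct and indirect dependencies."""
    tails = defaultdict(set)
    for head, tail in relations:
        tails[head].add(tail)

    # Direct dependencies are the tails of relations that start at `obj`.
    direct = set(tails[obj])

    # Indirect dependencies are targeted by a direct or another indirect dependency.
    indirect = set()
    visited = set()
    stack = list(direct)
    while stack:
        node = stack.pop()
        if node in visited:
            continue  # guards against cycles in the lineage
        visited.add(node)
        for tail in tails[node]:
            if tail not in direct and tail != obj:
                indirect.add(tail)
            stack.append(tail)
    return direct, indirect


direct, indirect = dependencies(relations, "raw.orders.amount")
print("direct:", sorted(direct))
print("indirect:", sorted(indirect))
```

In this sketch, an object that is reachable both directly and through another dependency is reported only as a direct dependency, which is a simplifying design choice.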