Automatic stitching for technical lineage
Stitching is a process that creates relations between assets and data objects representing the same data source. More specifically, stitching creates relations between the following assets:
- The assets that were created when you prepared Data Catalog's physical data layer for a data source; and
- The data objects in the same data source for which you created a technical lineage and that represent the assets in Data Catalog.
For Collibra Data Lineage to stitch the assets to the data objects, you must prepare the Data Catalog physical data layer to create the database > schema > table > column or system > database > schema > table > column hierarchy. Note that when a table in your data source has a schema and a file as its parents, Collibra Data Lineage uses the schema as the parent for stitching.
When the data sources are scanned, the Collibra Data Lineage service automatically creates and pushes new relations of the type "Data Element targets / sources Data Element":
- Between data objects in your data source and assets from registered data sources.
- Between ingested assets from BI sources and Data Catalog assets from registered data sources.
Example In a Tableau integration, the Tableau Data Attribute is the target of the Column, and the Column is the source of the Tableau Data Attribute.
Note If you don't prepare the Data Catalog physical data layer, Data Catalog creates a technical lineage without stitching. As a result, when you click the Technical lineage tab on any Column, Table, Tableau Data Attribute, Power BI Column or SSRS Column asset page, you get the message "The current asset doesn't have a technical lineage yet." However, you can still use the Browse tab pane to view the technical lineage of data objects in the data sources for which you created the technical lineage.
Stitching issues
To stitch assets in Data Catalog to data objects collected by the lineage harvester, the Collibra Data Lineage service compares the full path of the assets in Data Catalog with the full path of the data objects in your data source. Stitching is based on the full path of objects with the following structure: (system) > database > schema > table > column. If the full paths match, Collibra Data Lineage automatically stitches the data objects to the existing assets in Data Catalog. To indicate this, the assets have a yellow background in the technical lineage graph. Note that in Collibra, full paths are case-sensitive.
If the full path of an asset in Data Catalog does not exactly match the full path of a data object in your data source, including letter case, Collibra Data Lineage cannot stitch them. To indicate this, the data objects have a gray background in the technical lineage graph. To fix stitching issues, check the full path of the assets in Data Catalog and make sure they match the full path of the data objects that are shown in the technical lineage graph. If you change the full path, run the lineage harvester again.
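The following Python sketch illustrates the case-sensitive full-path comparison described above. It is not Collibra's implementation: the function names, the " > " separator, and the sample database, schema, table, and column names are all hypothetical and chosen only for illustration. It shows how an exact comparison separates matching paths from paths that differ only in letter case, which is a common reason for gray (unstitched) data objects.

```python
from typing import Dict, Iterable, List

# Hypothetical separator, used only in this sketch to build comparable strings.
SEPARATOR = " > "


def full_path(*parts: str) -> str:
    """Join hierarchy levels, for example database, schema, table, column."""
    return SEPARATOR.join(parts)


def classify(catalog_paths: Iterable[str],
             lineage_paths: Iterable[str]) -> Dict[str, List[str]]:
    """Compare lineage paths with catalog paths using exact, case-sensitive equality."""
    catalog = set(catalog_paths)
    # Lowercased copy, used only to explain why an exact match failed.
    catalog_lowered = {path.lower() for path in catalog}

    result: Dict[str, List[str]] = {
        "stitched": [], "case_mismatch": [], "unmatched": []
    }
    for path in lineage_paths:
        if path in catalog:
            result["stitched"].append(path)        # exact match
        elif path.lower() in catalog_lowered:
            result["case_mismatch"].append(path)   # differs only in letter case
        else:
            result["unmatched"].append(path)       # no counterpart at all
    return result


# Hypothetical sample data.
catalog = [
    full_path("SALES_DB", "PUBLIC", "ORDERS", "ID"),
    full_path("SALES_DB", "PUBLIC", "ORDERS", "TOTAL"),
]
lineage = [
    full_path("SALES_DB", "PUBLIC", "ORDERS", "ID"),
    full_path("sales_db", "public", "orders", "total"),
    full_path("SALES_DB", "PUBLIC", "CUSTOMERS", "ID"),
]
report = classify(catalog, lineage)
for status, paths in report.items():
    print(status, paths)
```

In this sketch, only the first lineage path would be treated as stitched. The second differs from a catalog path only in letter case, so correcting the case of the asset's full path in Data Catalog and running the lineage harvester again would be the fix suggested by the text above.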
- Collibra Data Lineage does not support stitching for Looker assets.
- Collibra Data Lineage supports stitching for MicroStrategy assets only if you use the new integration method (beta), which supports the latest MicroStrategy APIs.
Tip You can use the Stitching tab page to easily find the full path of assets in Data Catalog and data objects that were collected by the lineage harvester. The Stitching tab page also shows an overview of all assets and data objects that are stitched successfully.