Technical overview of BI tool lineage

This topic provides information about the technical lineage that is created when you ingest BI tool metadata in Data Catalog.

For a business perspective, see Technical lineage and stitching for BI tool integrations.

Currently, the information is shown for:

Looker
MicroStrategy
Power BI
SSRS-PBRS
Tableau

When you ingest Tableau metadata in Data Catalog, a technical lineage for Tableau Data Attribute assets is automatically created.

Permissions

If you have a Data Catalog global role with the Catalog and Technical lineage global permissions, you can see the technical lineage of Tableau assets by clicking on the Technical lineage tab on the asset page of either of the following asset types:

Tableau Data Attribute
Tableau Worksheet

Technical lineage graph

The technical lineage graph shows relations of the type "Data Element sources / targets Data Element" between Tableau assets and other data objects in the data flow, for example between a Column asset and a Tableau Data Attribute asset. These relations are created during the Tableau ingestion process as a result of automatic stitching.

Note If you use a Tableau <source ID> configuration file and don’t specify a value for the relevant collibraSystemName property, the designation “UNDEFINED” will be shown in the technical lineage.

Note If you use custom SQL that is not supported by the Tableau metadata API, the technical lineage might not be complete. For complete information, see the Tableau documentation on Tableau Catalog support for custom SQL and Tableau Lineage and custom SQL connections.

Example

The following technical lineage shows how data flows from a PostgreSQL data source to Tableau. It shows relations of the type "Data Element sources / targets Data Element" between the Column assets of the database and Tableau Data Attribute assets in Tableau. For example, Column asset L_RETURNFLAG has a relation of the type "Data Element sources / targets Data Element" to the Tableau Data Attribute assets Quantity and Adjusted Quantity.

UUIDs in the Tableau technical lineage

Collibra Data Lineage uses unique full names to create a technical lineage and stitch objects within the technical lineage. Full names in Collibra are constructed in accordance with the hierarchy of data objects in Tableau, for example:

Server > Site > Project > Workbook > Worksheet > Field
Server > Site > Project > Workbook > Data Model > Column

In Collibra, every node in this hierarchy must have a unique name. However, in Tableau, the names of data objects do not have to be unique. As such, if Tableau data objects in a technical lineage hierarchy have the same full name, Collibra Data Lineage adds the UUIDs of the corresponding assets to the names in the technical lineage, to maintain uniqueness.

In the following example image, the names of the assets Priority, Opened and Active in the technical lineage have been appended with their UUIDs.

Note

UUIDs are not added to the names of the assets themselves; they are only added to the names of the data objects in the technical lineage.
The UUID is always part of the full name of an asset, regardless of whether or not it is a duplicate.
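
The disambiguation rule described above can be illustrated with a minimal Python sketch. It is not the Collibra Data Lineage implementation: the function name `lineage_display_names`, the ` > ` separator, and the `name (uuid)` suffix format are assumptions made for illustration only.

```python
from collections import Counter


def lineage_display_names(objects):
    """Map each UUID to the name shown in the lineage graph.

    `objects` is a list of (uuid, path) pairs, where `path` is a tuple such as
    (server, site, project, workbook, worksheet, field).
    """
    # Build the hierarchical full name of every object.
    full_names = {uuid: " > ".join(path) for uuid, path in objects}
    counts = Counter(full_names.values())

    display = {}
    for uuid, path in objects:
        name = path[-1]
        if counts[full_names[uuid]] > 1:
            # Assumed suffix format; the real rendering may differ.
            name = f"{name} ({uuid})"
        display[uuid] = name
    return display


base = ("Server", "Site", "Project", "Workbook", "Worksheet")
objects = [
    ("uuid-1", base + ("Priority",)),
    ("uuid-2", base + ("Priority",)),  # same full name as uuid-1
    ("uuid-3", base + ("Opened",)),
]
display = lineage_display_names(objects)
```

In this sketch, only the two `Priority` objects that share a full name receive a UUID suffix; `Opened` keeps its plain name.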

How to resolve UUIDs in names in a technical lineage

To keep Collibra Data Lineage from adding UUIDs to the names of the data objects in a technical lineage, ensure that the names of all fields and columns in Tableau are unique.

Generally, Tableau doesn't allow you to create two fields or columns with the same name. However, hierarchy fields and non-hierarchy fields can have the same name. Duplication of names can also happen if:

A Tableau worksheet is using two different data sources that have columns with the same name.
You create a virtual connection that contains multiple data sources that have columns with the same name.
There are multiple data sources in Tableau with the same name.

Sources tab page

The Sources tab page shows, for each Tableau data source and Tableau Worksheet, the transformation and calculation rules that the Collibra Data Lineage service analyzed and processed, and the results of the analysis. It also shows the TECHLIN VIEW query definitions, based on custom SQL queries.

If a parameter is used in a Tableau worksheet, it is shown in the worksheet source code, for example:

PARAMETERS: 'parameter1'.

If a parameter is used in a calculation rule, it is also shown under the Tableau data source for data sources in the calculation rule, for example:

CALCULATION RULE: '[List price]/[parameter1]'
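
To check which parameters a calculation rule refers to, you could extract the bracketed names from the rule text. The following Python sketch assumes that references always use the `[name]` form shown in the example above; the helper names `referenced_names` and `used_parameters` are hypothetical and are not part of any Collibra or Tableau API.

```python
import re

# Matches a bracketed reference such as [List price]; assumes no nested brackets.
FIELD_REF = re.compile(r"\[([^\[\]]+)\]")


def referenced_names(calculation_rule):
    """Return every bracketed name in the rule, in order of appearance."""
    return FIELD_REF.findall(calculation_rule)


def used_parameters(calculation_rule, parameters):
    """Return the referenced names that are known parameters."""
    return [name for name in referenced_names(calculation_rule) if name in parameters]


rule = "[List price]/[parameter1]"
names = referenced_names(rule)
params = used_parameters(rule, {"parameter1"})
```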

The success rate of the analysis indicates how complete the technical lineage is. There are a few limitations that prevent the Collibra Data Lineage service from processing all Tableau metadata.

Important The Collibra Data Lineage service might not be able to process all complex Tableau metadata. This means that the success rate of a Tableau ingestion might not be 100%.

Note

With the implementation of SQL extension v2 in Collibra 2024.01.1, queries and their relevant assets are now combined into a single line in the transformations table. Because of this, the source code processed count in the Done column is reduced. The lineage itself is unchanged.

Error codes

The Errors summary represents a summary of all errors per Tableau site. The continue on error feature allows for continuous processing of an import or synchronization job, even if one or more commands fail.

Warning codes

Warning codes indicate:

Issues that might affect the technical lineage, but do not stop the processing.
Issues that you can resolve.

Element	Description
ID	The warning ID number.
Name	The name of the warning. Possible values are: Empty name, Field relation not found, Parent project not found, Parent workbook not found, Parent database not found, Datasource not found, Worksheet not found, REST datasource not found, Not found in remote fields, Multiple datasources, Query parsing error, Invalid Collibra system names, Invalid hostname mapping.
Status code	The status label. The value is always WARNING.
Status description	Identifies a grouping of warnings. Warnings of the same type (meaning they have the same group name and name) are grouped together in "parts" of up to 100 warnings. For example, 250 Configuration > Invalid Collibra system names warnings are grouped into parts 1, 2 and 3.
Group name	The type, or category, of warning. Possible values are: Configuration, Mismatched ID, Missing content.
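
The grouping into parts can be pictured with a short Python sketch. It only illustrates the arithmetic (250 warnings of one type yield parts of 100, 100 and 50); the dictionary keys `group` and `name` and the function `group_into_parts` are assumptions, not the format used by the Collibra Data Lineage service.

```python
def group_into_parts(warnings, part_size=100):
    """Group warnings by (group name, name), then split each group into parts."""
    grouped = {}
    for warning in warnings:
        key = (warning["group"], warning["name"])
        grouped.setdefault(key, []).append(warning)

    # Slice every group into consecutive parts of at most `part_size` warnings.
    return {
        key: [items[i:i + part_size] for i in range(0, len(items), part_size)]
        for key, items in grouped.items()
    }


warnings = [
    {"group": "Configuration", "name": "Invalid Collibra system names", "id": n}
    for n in range(250)
]
parts = group_into_parts(warnings)
part_sizes = [
    len(part)
    for part in parts[("Configuration", "Invalid Collibra system names")]
]
```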

The following table shows the complete set of warning codes, by group and name.

Group name	Name	Description
Missing content	Empty name	Raised during the processing of databases, tables, columns, worksheets and dashboards. Contains the following lines: `Database with id DATABASE_ID is skipped` `Table with id TABLE_ID is skipped` `Column with id COLUMN_ID is skipped` `Worksheet with id WORKSHEET_ID is skipped` `Dashboard with id DASHBOARD_ID is skipped` Indicates that the `name` property of a database, table, column, worksheet or dashboard, which has a specified value for the `id` property, has a `null` value or it is empty: Example for a database: { "data": { "databasesConnection": { "nodes": [ { "id": "DATABASE_ID", "name": null, ... Note The `name` property is considered empty if the value is `null` or if it is empty.
Missing content	Parent database not found	Raised during the processing of tables. Contains the line: `Table with id TABLE_ID is skipped` Indicates that the parent database for a table with `TABLE_ID` was not found in the previously processed databases. Possible causes: The `database` property is not present in the JSON file. The `database` property is empty: `"database": {}`. The `DATABASE_ID` is not present for the `id` property. { "data": { "tablesConnection": { "nodes": [ { "id": "TABLE_ID", "database": { "id": "DATABASE_ID" } ...
Missing content	Parent project not found	Raised during the processing of projects, workbooks and published data sources. Contains the following lines: `Workbook with id WORKBOOK_ID is skipped` `Published datasource with id DATASOURCE_ID is skipped` `Project with id PROJECT_ID has unreachable parent project` Indicates that the parent project of a project, workbook, or published data source was not found in the previously processed projects. Possible causes: The `project` property is not present in the JSON file. The `project` property is empty: `"project": {}`. The `PROJECT_ID` is not present for the `id` property. Example for a workbook: { "workbooks": { "workbook": [ { "project": { "id": "PROJECT_ID" }, "id": "WORKBOOK_ID", ... Example for a published datasource: To identify the `PROJECT_ID`, first find the `DATASOURCE_LUID` of the published data source, as returned by the metadata API: { "data": { "datasourcesConnection": { "nodes": [ { "__typename": "PublishedDatasource", "id": "DATASOURCE_ID", "luid": "DATASOURCE_LUID" ... Then, in the data returned by the REST API, reference the `DATASOURCE_LUID` to identify the `PROJECT_ID` of the data source: { "datasources": { "datasource": [ { "id": "DATASOURCE_LUID", "project": { "id": "PROJECT_ID", ... Example for a project: `PARENT_PROJECT_ID` is not found: { "projects": { "project": [ { "id": "PROJECT_ID", "parentProjectId": "PARENT_PROJECT_ID", ... The project is not skipped in this case. A new parent project is created with the name `Unknown project PARENT_PROJECT_ID`.
Missing content, Mismatched ID	Parent workbook not found	Raised during the processing of worksheets, dashboards, REST-only views and embedded data sources. Contains the following lines: `Worksheet with id WORKSHEET_ID is skipped` `Dashboard with id DASHBOARD_ID is skipped` `View with id VIEW_ID is skipped (rest only)` `Embedded data source with id DATASOURCE_ID is skipped` Indicates that the parent workbook of a worksheet, dashboard or view with a specified ID was not found in the previously processed workbooks. Possible causes: The `workbook` property is not present in the JSON file. The `workbook` property is empty: `"workbook": {}`. `WORKBOOK_ID` is not present for the `luid` property, which is a mismatched ID issue. Example for a worksheet: { "data": { "sheetsConnection": { "nodes": [ { "id": "WORKSHEET_ID", "workbook": { "luid": "WORKBOOK_ID" ... Example for a dashboard: { "data": { "dashboardsConnection": { "nodes": [ { "id": "DASHBOARD_ID", "workbook": { "luid": "WORKBOOK_ID" ... Example for an embedded data source: { "data": { "dashboardsConnection": { "nodes": [ { "id": "DASHBOARD_ID", "workbook": { "luid": "WORKBOOK_ID" ... Note Use the `luid` property, not the `id` property, to find a workbook by ID.
Missing content	Worksheet not found	Raised during the processing of dashboards. Contains the line: `Worksheet with id WORKSHEET_ID is skipped for dashboard with id DASHBOARD_ID` Indicates that a worksheet with a given ID was not found in the previously processed worksheets. { "data": { "dashboardsConnection": { "nodes": [ { "id": "DASHBOARD_ID", "sheets": [ { "id": "WORKSHEET_ID" }, ... Possible cause: `WORKSHEET_ID` is not present for the `id` property.
Mismatched ID	REST datasource not found	Raised during the processing of published data sources. Contains the line: `Published datasource with id DATASOURCE_ID is skipped` Indicates that a data source with `DATASOURCE_ID` could not be matched with the `DATASOURCE_LUID` returned by the REST API, resulting in a mismatched ID. { "data": { "datasourcesConnection": { "nodes": [ { "__typename": "PublishedDatasource", "id": "DATASOURCE_ID", "luid": "DATASOURCE_LUID" ... During processing, information returned by the metadata API and the REST API is combined. Collibra Data Lineage then looks to the `DATASOURCE_LUID` property in the REST metadata to identify the correct project ID, which is lacking from the information returned by the metadata API. This only applies to published data sources, as embedded data sources are assigned to workbooks, not projects.
Missing content	Datasource not found	Raised during the processing of embedded data sources. Contains the line: `Embedded datasource with id EMBEDDED_DATASOURCE_ID references non existing published datasource with id PUBLISHED_DATASOURCE_ID` Indicates that an embedded data source with `EMBEDDED_DATASOURCE_ID` references a published data source with `PUBLISHED_DATASOURCE_ID`, which was not found in the previously processed data sources. { "data": { "datasourcesConnection": { "nodes": [ { "__typename": "EmbeddedDatasource", "id": "EMBEDDED_DATASOURCE_ID", "upstreamDatasources": [ { "id": "PUBLISHED_DATASOURCE_ID", ... Possible cause: `PUBLISHED_DATASOURCE_ID` is not present for the `id` property.
Missing content	Field relation not found	Raised during the processing of data source fields. Contains the lines: `Referenced field with id FIELD_ID is skipped` `Report field with id FIELD_ID is skipped` Indicates that a field with a given `FIELD_ID` was not found in remote fields, which is needed to create relations. { "data": { "datasourcesConnection": { "nodes": [ { "id": "DATASOURCE_ID", "fieldsConnection": { "nodes": [ { "__typename": "DatasourceField", "remoteField": { "id": "FIELD_ID" ... Possible cause: An embedded data source has a calculated field that is not mapped to any published data source field. This can occur: During the processing of referenced fields. In this case, the relation between the two Tableau Data Attributes cannot be created. During the processing of report fields. In this case, the relation between the Tableau Data Attribute and the Tableau Worksheet cannot be created.
Missing content	Multiple datasources	Raised during the processing of custom SQL queries. Contains the line: `Custom sql query with id QUERY_ID contains columns of NUMBER_OF_DATASOURCES datasources. Found best datasource: DATASOURCE_ID` Indicates that a query with `QUERY_ID` has matched multiple data sources. Only one data source can be used: the data source with `DATASOURCE_ID`. The warning is caused by the fact that there is no direct relation between the query and the data source. The algorithm tries to find the best data source, based on a comparison of the list of query columns and the data source columns. To verify this, do the following: Find the query with `QUERY_ID` and the columns (see `COLUMN_ID`) in the table JSON data: { "data": { "tablesConnection": { "nodes": [ { "__typename": "CustomSQLTable", "id": "QUERY_ID", "columnsConnection": { "nodes": [ { "id": "COLUMN_ID", ... Find the data source with `DATASOURCE_ID` in the data source JSON data. It should contain all of the columns (see `COLUMN_ID`) that are used by the query: { "data": { "datasourcesConnection": { "nodes": [ { "id": "DATASOURCE_ID", "fieldsConnection": { "nodes": [ { "upstreamColumnsConnection": { "nodes": [ { "id": "COLUMN_ID" ... The data source found for this query (meaning `DATASOURCE_ID`) might not be the right one for the `TECHLIN VIEW` definition. In this case, the data source `DATASOURCE_ID` might have the wrong relations between the Tableau Data Attribute asset and the Column asset.
Missing content	Datasource not found	Raised during the processing of custom SQL queries. Contains the line: `Custom sql query with id QUERY_ID is skipped` Indicates that the query with `QUERY_ID` contains columns that are not referenced by any data source fields, so the data source can’t be assigned to the query. { "data": { "tablesConnection": { "nodes": [ { "__typename": "CustomSQLTable", "id": "QUERY_ID", "columnsConnection": { "nodes": [ { "id": "COLUMN_ID", ...
Missing content	Query parsing error	Raised during the processing of custom SQL queries. Contains the line: `Error parsing query with id QUERY_ID, error: ERROR` Indicates that there is an issue when deriving column names from a query for a custom SQL with `QUERY_ID`. { "data": { "tablesConnection": { "nodes": [ { "__typename": "CustomSQLTable", "id": "QUERY_ID", "query": "QUERY" Custom SQL is still processed as `TECHLIN VIEW` with no columns.
Configuration	Invalid Collibra system names	Raised during the processing of the `collibraSystemNames` section in the <source ID> configuration file. Contains the lines: `Collibra system name not found for database with hostname "DB_HOSTNAME"` `Collibra system name not found for file with path "FILE_PATH"` `Collibra system name not found for connector with url "CONNECTION_URL"` `Collibra system name not found for cloud file with name "CLOUD FILE PATH"`
Configuration	Invalid hostname mapping	Raised during the processing of the `hostnameMapping` section of the <source ID> configuration file. Contains the line: `Collibra system name not found for database "DB_NAME" host "HOST_NAME" and schema "SCHEMA_NAME"`
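
To investigate "Empty name" and "Parent database not found" warnings, you could run a check like the following Python sketch against saved metadata API responses. It follows the JSON shapes shown in the table above, but the function names are hypothetical. It also assumes that a database skipped for an empty name does not count as a previously processed database, which the documentation does not confirm.

```python
def is_empty_name(node):
    """A name counts as empty if it is missing, null, or an empty string."""
    name = node.get("name")
    return name is None or name == ""


def find_skipped(databases_response, tables_response):
    """Return (warning name, message) pairs for databases and tables."""
    db_nodes = databases_response["data"]["databasesConnection"]["nodes"]
    table_nodes = tables_response["data"]["tablesConnection"]["nodes"]

    found = []
    known_database_ids = set()
    for db in db_nodes:
        if is_empty_name(db):
            found.append(("Empty name", f"Database with id {db['id']} is skipped"))
        else:
            known_database_ids.add(db["id"])

    for table in table_nodes:
        if is_empty_name(table):
            found.append(("Empty name", f"Table with id {table['id']} is skipped"))
            continue
        # Covers a missing `database` property, an empty one, and an unknown ID.
        parent_id = (table.get("database") or {}).get("id")
        if parent_id not in known_database_ids:
            found.append(
                ("Parent database not found", f"Table with id {table['id']} is skipped")
            )
    return found


databases = {"data": {"databasesConnection": {"nodes": [
    {"id": "DB1", "name": "sales"},
    {"id": "DB2", "name": None},
]}}}
tables = {"data": {"tablesConnection": {"nodes": [
    {"id": "T1", "name": "orders", "database": {"id": "DB1"}},
    {"id": "T2", "name": "customers", "database": {}},
    {"id": "T3", "name": "", "database": {"id": "DB1"}},
]}}}
skipped = find_skipped(databases, tables)
```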

When you ingest Power BI metadata in Data Catalog, Collibra Data Lineage automatically creates a technical lineage for assets of the following types:

Power BI Report
Power BI Table
Power BI Column

To view the technical lineage, go to the asset page of any asset of these types, and then click the Technical Lineage tab.

Important

In Collibra 2024.05, we launched a new user interface (UI) for Collibra Platform! You can learn more about this latest UI in the UI overview.

Note If you ingest Power BI for the first time or if you change your geolocation or cloud provider, you have to restart the Collibra service before you can see your technical lineage.

Technical lineage graph

The technical lineage graph shows relations of the type "Data Element targets / sources Data Element" between BI assets and other data objects in the data flow, for example Column assets or Power BI Column assets. These relations are created during the Power BI ingestion process as a result of automatic stitching.

Example

The following technical lineage shows the relation of the type "Data Element targets / sources Data Element" between the Column asset LISTPRICE and the Power BI Column asset ListPrice.

Note

When harvesting Power BI, report attributes are not returned by the API. Therefore, for a given report, Collibra Data Lineage creates a dummy report attribute. This dummy report attribute is identified in the technical lineage by an asterisk (*), as shown in the following example image. Links are drawn from all data attributes in the semantic model that were used to create the report, to the dummy report attribute.
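
The way links are drawn to the dummy report attribute can be pictured with a small Python sketch. The function `dummy_attribute_edges` and the `report > *` naming are illustrative assumptions; only the asterisk marker and the direction of the links come from the description above.

```python
def dummy_attribute_edges(report_name, used_attributes):
    """Link every data attribute used by a report to one dummy report attribute."""
    # Assumed naming: the dummy attribute is the report name followed by "*".
    dummy = f"{report_name} > *"
    return [(attribute, dummy) for attribute in used_attributes]


edges = dummy_attribute_edges(
    "Sales report",
    ["Product.ListPrice", "Product.Name"],
)
```

Each pair reads as "source attribute, target attribute", so every attribute used by the report points to the same dummy node.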

Tip Does your database or schema have the name "Default" in the technical lineage graph? This is the case if you use a Power Query M function that doesn’t have the schema or database name specified, or if Power BI hasn't returned the database or schema name. In this case, you can configure database and schema mapping in your <source ID> configuration file, to provide the name of the database or schema. This allows you to achieve stitching and view the lineage you need. For more information, go to Broken stitching and possible solutions.

Sources tab page

The Sources tab page shows the transformation details that were analyzed and processed on the Collibra Data Lineage service instances and the results of this analysis. The success rate of the analysis indicates how complete the technical lineage is.

Important The Collibra Data Lineage server can process most, but not all, complex Power BI metadata. This means that the success rate of a Power BI ingestion can be very high, but almost never 100%.

Example

The following image shows that you have created a technical lineage for four data sources. Power BI has a success rate of 83%. When you use the transformation logs to investigate the errors, you see that the Collibra Data Lineage service instance couldn't process some elements of the Power BI metadata, for example because they are not supported or there is an issue in the configuration file or the Power BI setup.

When you ingest MicroStrategy metadata in Data Catalog, Collibra Data Lineage automatically creates a technical lineage.

To view the technical lineage, click the Technical lineage tab on the asset page of any of the following asset types:

MicroStrategy Report
MicroStrategy Dossier
MicroStrategy Document
MicroStrategy Data Attribute

The Technical lineage tab is only shown if you have the Data Catalog global role with the Catalog and Technical lineage global permissions.

Note If you ingest MicroStrategy for the first time or if you change your geolocation or cloud provider, you have to restart the Collibra service before you can see the technical lineage.

Technical lineage graph

The technical lineage graph shows relations of the type "Data Element targets / sources Data Element" between BI assets and other data objects in the data flow, for example Column assets or MicroStrategy Data Attribute assets. These relations are created during the MicroStrategy ingestion process as a result of automatic stitching.

MicroStrategy API limitations

Limitation	Details
Report attributes	When harvesting MicroStrategy, report attributes are not returned by the API. Therefore, for a given report, Collibra Data Lineage creates a dummy report attribute. This dummy report attribute is identified in the technical lineage by an asterisk (*), as shown in the following example image. Links are drawn from all data attributes in the semantic model that were used to create the report, to the dummy report attribute.
Reports	The integration supports all common MicroStrategy reports. However, due to limitations with the MicroStrategy APIs: The following report subtypes are not supported: data mart, bulk export, incremental refresh, and transaction. The following report extended types are not supported: MDX, Query Builder, Freeform XQuery, and Data Import. Any log messages referring to skipped end points are due to the MicroStrategy API limitations. They are not due to an error or lack of functionality on the part of Collibra Data Lineage.
Freeform SQL	The integration supports MicroStrategy Freeform SQL. However, due to limitations with the MicroStrategy APIs, it is only supported for reports, not cubes. Note Freeform SQL is supported for reports (not cubes or dossiers) if you have MicroStrategy update10 or newer, or MicroStrategy ONE.
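
If you want to predict which reports will be skipped, you could filter an inventory of reports against the unsupported types listed above. In the following Python sketch, the dictionary keys `subtype` and `extended_type` and the label strings are assumptions for illustration; they are not values returned by the MicroStrategy APIs.

```python
UNSUPPORTED_SUBTYPES = {"data mart", "bulk export", "incremental refresh", "transaction"}
UNSUPPORTED_EXTENDED_TYPES = {"MDX", "Query Builder", "Freeform XQuery", "Data Import"}


def is_supported(report):
    """Return True if neither the subtype nor the extended type is unsupported."""
    return (
        report.get("subtype") not in UNSUPPORTED_SUBTYPES
        and report.get("extended_type") not in UNSUPPORTED_EXTENDED_TYPES
    )


reports = [
    {"name": "Revenue", "subtype": "grid", "extended_type": "Relational"},
    {"name": "Export", "subtype": "bulk export", "extended_type": "Relational"},
    {"name": "Cube view", "subtype": "grid", "extended_type": "MDX"},
]
supported = [report["name"] for report in reports if is_supported(report)]
```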

UUIDs in the MicroStrategy technical lineage

Collibra Data Lineage uses unique full names to create a technical lineage and stitch objects within the technical lineage. Full names in Collibra are constructed in accordance with the hierarchy of data objects in MicroStrategy, for example:

Server > Project > Folder > Report > Data Entity > Data Attribute
Server > Project > Folder > Dossier > Data Entity > Data Attribute
Server > Project > Folder > Document > Data Entity > Data Attribute

In Collibra, every node in this hierarchy must have a unique name. However, in MicroStrategy, the names of data objects do not have to be unique. As such, if MicroStrategy data objects in a technical lineage hierarchy have the same full name, Collibra Data Lineage adds the UUIDs of the corresponding assets to the names in the technical lineage, to maintain uniqueness.

In the following example image, the names of the assets Priority, Opened and Active in the technical lineage have been appended with their UUIDs.

Note

UUIDs are not added to the names of the assets themselves; they are only added to the names of the data objects in the technical lineage.
The UUID is always part of the full name of an asset, regardless of whether or not it is a duplicate.

To keep Collibra Data Lineage from adding UUIDs to the names of the data objects in a technical lineage, ensure that the names of all data objects in MicroStrategy are unique.

Sources tab page

The Sources tab page shows the expressions that the Collibra Data Lineage service analyzed and processed, and the results of the analysis. It also shows the TECHLIN VIEW query definitions, based on custom SQL queries.

Note MicroStrategy uses the term "expressions" instead of "transformations".

Source code is provided for the following MicroStrategy asset types:

MicroStrategy Document
MicroStrategy Dossier
MicroStrategy Report
MicroStrategy Data Entity

The success rate of the analysis indicates how complete the technical lineage is.

For example, the following image shows that you have created a technical lineage for two data sources. SAP HANA has a success rate of 83%. When you use the transformation logs to investigate the errors, you see that the Collibra Data Lineage service instance couldn't process some elements of the SAP HANA metadata, for example because they are not supported or because there is an issue in the configuration file.

When you ingest Looker metadata, you automatically create a technical lineage for Looker Look assets. If you have the right permissions to view the technical lineage, you can go to a Looker Look asset page and click the Technical lineage tab, which allows you to access the technical lineage.

Example

The following technical lineage graph shows the technical lineage of Looker objects.

When you ingest SQL Server Reporting Services (SSRS) and Power BI Report Server (PBRS) metadata in Data Catalog, you automatically create a technical lineage for SSRS Column assets. Each SSRS Column asset page has a Technical lineage tab page that shows the technical lineage of that SSRS Column asset.

We cannot access PBRS lineage information. As a result, you can only create a technical lineage for SSRS Column assets.

Note If you ingest SSRS and PBRS for the first time, or if you change your geolocation or cloud provider, you might have to restart the Collibra service before you can see the technical lineage.

Technical lineage graph

The technical lineage graph shows relations of the type "Column is source for / is target of Data Attribute" between BI assets and other data objects in the data flow, for example Column assets or SSRS Column assets. These relations are created during the ingestion process as a result of automatic stitching.

For more information about the technical lineage, see the Collibra Data Lineage section in the documentation.

Example

The following technical lineage shows the relation of the type "Data Element sources / targets Data Element" between the Column assets FOOD_NAME, FOOD_TYPE and FOOD_CODE and the SSRS Column assets food_name, food_type and food_code.

Sources tab page

The Sources tab page shows the transformation details that the Collibra Data Lineage service analyzed and processed and the results of this analysis. The success rate of the analysis indicates how complete the technical lineage is.

Important The Collibra Data Lineage service can process most, but not all, complex metadata. This means that the success rate of an ingestion job can be very high, but might not be 100%.