Note This feature is not available on Collibra Cloud for Government.

DAX analysis via Collibra AI (Beta)

Data Analysis Expressions (DAX) analysis via Collibra AI:

  • Creates column-level lineage that includes your calculated columns and measures in Power BI.
  • Enables stitching between calculated columns in the technical lineage and the corresponding Power BI Column assets in Data Catalog.

Note This feature is in Beta testing and is powered by Collibra AI.

DAX is a programming language that is used in Power BI for creating calculated columns, measures and custom tables. Calculated columns and measures in Power BI are ingested in Data Catalog as Power BI Column assets.

Prior to this feature, Power BI columns that are derived from DAX were shown in the technical lineage, but because the Collibra Data Lineage service instances are unable to analyze DAX, stitching between calculated columns in the technical lineage and the corresponding Power BI Column assets in Data Catalog was not possible.

Tip For information on how we leverage AI in our products, visit the Collibra Trust Site.

Enable DAX analysis

To enable this feature:

How does DAX analysis via Collibra AI work?

  1. During Power BI integration, the lineage harvester or Edge sends a request to the Collibra Data Lineage service instance, to process the Power BI metadata.
  2. The Collibra Data Lineage service instance processes the metadata as usual, but the DAX queries are collected separately.
  3. The Power BI scanner connects to the authentication server, to request a token.
  4. The token is used to make a request of the proxy API.
    1. The DAX queries are sent to the Vertex AI model, where they are analyzed. The output is a list of tables and columns for which lineage relations are created in Data Catalog.
      Note Vertex AI has no context about the DAX queries. Vertex AI only analyzes the DAX queries and sends them to the Power BI scanner, via the proxy API, in a JSON file.
    2. Vertex AI sends a JSON file to the Power BI scanner.
    3. The Power BI scanner continues processing the tables and columns, just as it does for Power Query M functions, and creates the technical lineage.
  5. The technical lineage is ingested in Collibra and new relations are created between Power BI Column assets.

Attributes on Power BI Column asset pages

On Power BI Column asset pages, there are two attribute types that are relevant for this feature: the Role in Report and Calculation Rule attributes.

Important 

In Collibra 2024.02, we've launched a new user interface (UI) in beta for Collibra Data Intelligence Platform! You can learn more about this latest UI in the UI overview.

Use the following options to see the documentation in the latest UI or in the previous, classic UI:

Role in Report

For measures, the Power BI API returns sufficient information for Collibra Data Lineage to identify the columns as such. Therefore, the value of the Role in Report attribute is appropriately shown as Measure.

For calculated columns, the Power BI API can't distinguish between native columns (those read from the data source or evaluated by a query written in Power Query M or Power BI) and calculated columns. Therefore, no value is shown for the Role in Report attribute.

Calculation Rule

The Calculation Rule attribute type shows the DAX query of calculated columns and measures, and calculated tables.

If the Calculation Rule attribute type is not showing on the asset page, ensure that the attribute type is part of the global assignment of the Power BI Column and Power BI Table asset types. By default, it is not included.

For more information, go to Show DAX calculations on Power BI Column asset pages.

Results in the technical lineage graph

Prior to DAX analysis via Collibra AI, column-level lineage showed only the calculated column and how it related to upstream and downstream data sources. It was not possible to know which columns are aggregated to derive the calculated column. Now, much more information is now shown.

With DAX analysis enabled:

  • All of the columns that are aggregated to derive the calculated column are shown in the technical lineage graph.
  • The column names are shown in blue text.
    Note If the names of calculated columns and measures in the lineage are not shown in blue text, it could be that:
    • DAX analysis is not enabled.
    • DAX analysis is enabled, but the calculation is static, so there is no lineage.

In the following example image:

  • The six native columns () that are aggregated to derive the column Column_ACCRINT_Ex1 are shown in the graph.
  • Outbound edges from the data source () show the relation between the columns in the data source and the native columns in the downstream data set ().

If you zoom out to the table-level lineage:

  • The columns that are aggregated () to derive the calculated columns are shown in the graph.
  • The calculated columns and measures () are shown.

Note Collibra Data Lineage cannot create lineage for DAX calculations that involve numbers, as is the case, for example with, the ACOT function.

Tooltips in the technical lineage graph

If you hover over a calculated column, the tooltip identifies it as "Power BI Column (Calculated)".

If the column is a measure, the tooltip identifies it as "Power BI Column (Measure)".

Review the results of DAX analysis

  1. Open the asset page of a Power BI Report, Power BI Column or Power BI Table, and then click Technical lineage.
  2. In the technical lineage toolbar, click .
    The source code pane opens. Here you can see the DAX query and compare it to the lineage.
  3. In the technical lineage toolbar, in the drop-down list, select Objects and Transformations, to see a high-level view.

Costs and character limit

Collibra Data Lineage calculates the number of characters in the DAX queries that are sent to the Vertex AI model, and the number of characters in the analyzed queries that Vertex AI returns to the Collibra Data Lineage service instance for processing. The total number of input, output, and embedded characters determines the cost of the analysis.

Collibra covers the cost of the analysis. We do, however, limit the number of DAX characters that Collibra Data Lineage will process for a job. The limit is intended to restrict only the largest amounts of metadata, in consideration of the cost of processing. Under certain circumstances, we can raise the limit. For more information, contact Collibra Support.

If the character limit is reached, the following error message is shown in the source code:

Maximum amount of DAX queries is reached. Please use workspace filtering.

If this occurs, we encourage you to use workspace filtering to limit the amount of Power BI metadata that is processed.

Measures to reduce costs and drive quality

We've taken the following measures to help reduce costs and drive quality.

Cached queries and results

The first time you integrate or synchronize Power BI after enabling this feature, Collibra Data Lineage processes all of your DAX queries – with respect to the character limit – and caches them. The analyzed queries that are returned by Vertex AI are also cached. As such, during subsequent synchronizations, if a specific DAX query is encountered, it doesn't need to be processed again. Caching helps to reduce costs, improve performance, and ensure higher-quality results from Vertex AI.

If you reach the character limit, Collibra Data Lineage only shows the lineage of the cached calculations. No new lineage will be created, but this does not negatively affect the ingestion job.

Example store

To help prevent against missing lineage, we've built an example store, which is a collection of example DAX queries. Vertex AI refers to the example store to know if a specific query format is known and, if so, how it was previously processed. Upon request, we can add new format examples to the store, to have a more complete collection and reduce the chances of an incorrect or missing lineage.

If you encounter incorrect or missing lineage, please create a Support ticket. We will either anonymize your query or build it in our own environment and then add it to the example store.

AI temperature setting

AI temperature is a number between 0 and 2, which determines the randomness of the output. The higher the setting, the more random the results. The lower the setting, the more deterministic the results.

We have set the AI temperature for this feature to 0, to obtain less random, more deterministic results. In other words, we would prefer to not create a lineage over creating an incorrect lineage.