Prepare the lineage harvester configuration file for Power BI (deprecated)

You have to prepare a technical lineage configuration file and run the lineage harvester to fetch the Power BI analysis results from the Collibra Data Lineage server and send them as an import job to your Collibra Data Intelligence Cloud environment.

Note Comments in the lineage harvester configuration file are not supported.

Tip For more information, see Collibra Data Lineage.

Prerequisites

Steps

  1. Start the lineage harvester to create an empty lineage harvester configuration file by entering the following command:
    • Windows: .\bin\lineage-harvester.bat
    • For other operating systems: chmod +x bin/lineage-harvester and then bin/lineage-harvester
    An empty configuration file is created in the config folder.
  2. Open the configuration file and enter the values for each property.
    Properties and descriptions:

    general
      This section describes the connection information between the lineage harvester and Data Catalog.

    catalog
      This section contains the information that is necessary to connect to Data Catalog.

    url
      The URL of your Collibra Data Intelligence Cloud environment.
      Note You can only enter the public URL of your Collibra DGC environment. Other URLs are not accepted.

    username
      The username that you use to sign in to Collibra.

    sources
      This section describes the data sources for which you want to create the technical lineage. You have to create a configuration section for each data source.
      Note You can add multiple data sources to the same configuration file.

    type
      The kind of data source. In this case, the value has to be ExistingLineage.

    id
      The unique ID to identify the Power BI service metadata that was uploaded to the Collibra Data Lineage server. The value has to be the same as the value you used in the sourceId property in the Power BI configuration file.
      Tip This value can be anything as long as it is a unique ID and the same as the value of the sourceId property in the Power BI configuration file. The Power BI harvester and the lineage harvester use the ID to identify a batch of data on the Collibra Data Lineage server.
      Tip If you want to ingest multiple Power BI applications, create a separate Power BI configuration file for each Power BI application, each with a unique source ID. Then duplicate the Power BI section in the lineage harvester configuration file for each application and enter the corresponding source ID in the id property.
  3. Save the configuration file.
  4. In the console, start the lineage harvester again by running the following command:
    • For Windows: .\bin\lineage-harvester.bat full-sync
    • For other operating systems: ./bin/lineage-harvester full-sync
  5. When prompted, enter the password to connect to your Collibra Data Intelligence Cloud environment.
    The password is encrypted and stored in /config/pwd.conf.
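
The properties described in step 2 can be sketched as a small Python script that assembles the configuration and prints it as JSON. This is only an illustration, not an official template. It assumes that catalog nests under general and that sources is a list with one entry per Power BI source ID, which is how the property table reads. The function name build_harvester_config, the URL, the username, and the source IDs are hypothetical placeholders. Because comments in the configuration file are not supported, the script emits plain JSON without comments.

```python
import json


def build_harvester_config(url, username, source_ids):
    """Return the configuration as a dict, with one ExistingLineage source per ID.

    Assumption: "catalog" nests under "general" and "sources" is a list.
    """
    # Each ID has to be unique, because it identifies one batch of Power BI
    # metadata on the Collibra Data Lineage server.
    if len(set(source_ids)) != len(source_ids):
        raise ValueError("each Power BI source ID must be unique")
    return {
        "general": {
            "catalog": {
                "url": url,
                "username": username,
            }
        },
        "sources": [
            {"type": "ExistingLineage", "id": source_id}
            for source_id in source_ids
        ],
    }


# Placeholder values for illustration only, not a real environment or account.
config = build_harvester_config(
    "https://example.collibra.com",
    "harvester-user",
    ["powerbi-sales", "powerbi-finance"],
)

# json.dumps produces plain JSON, so the output contains no comments.
print(json.dumps(config, indent=2))
```

Each value in the sources list has to match the sourceId property of the corresponding Power BI configuration file. Paste the printed JSON into the configuration file that the lineage harvester created in the config folder.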

What's next?

The lineage harvester triggers Collibra to import Power BI assets and their relations and create a technical lineage for Power BI Column assets. Collibra also stitches the new Power BI assets to existing assets in Data Catalog.

To refresh the Power BI metadata in Data Catalog, you can run the Power BI harvester and lineage harvester again or schedule jobs to run them automatically.
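
A scheduled job for the lineage harvester part of this refresh could be sketched in Python as follows. The sketch only wraps the full-sync command from the steps above. The function names, the install_dir parameter, and the example path are hypothetical, and the Power BI harvester, which has to run first, is not covered here. It also assumes that the password was already entered during an earlier interactive run, so that the job is not blocked by a prompt.

```python
import platform
import subprocess
from pathlib import Path


def full_sync_command(install_dir, system=None):
    """Build the full-sync command for the given OS name (default: current OS).

    install_dir is a hypothetical parameter: the folder that contains the
    harvester's bin and config folders.
    """
    system = system or platform.system()
    script = "lineage-harvester.bat" if system == "Windows" else "lineage-harvester"
    # Use the full path to the script so that the lookup does not depend on
    # the current working directory of the scheduler.
    return [str(Path(install_dir) / "bin" / script), "full-sync"]


def run_full_sync(install_dir):
    """Run full-sync from the installation folder and fail loudly on errors."""
    # check=True raises CalledProcessError if the harvester exits with a
    # non-zero exit code, so a scheduler can detect the failure.
    return subprocess.run(full_sync_command(install_dir), cwd=install_dir, check=True)


# Show the command that would run on this machine (placeholder path).
print(full_sync_command("/opt/lineage-harvester"))
```

A scheduler such as cron or Windows Task Scheduler could call run_full_sync at the chosen interval, after the Power BI harvester has finished.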

Tip You can check the progress of the Power BI ingestion and technical lineage creation in Activities. The Results field indicates how many relations were imported into Data Catalog.

Warning When you run the harvesters, Collibra Data Lineage creates all Power BI assets in the same Data Catalog BI domain. We highly recommend that you do not move these assets to another domain. If you move assets to another domain, they are deleted and recreated in the initial Data Catalog BI domain when you synchronize Power BI. As a consequence, any data that you added manually to those assets is lost.