Prepare the lineage harvester configuration file for Power BI (deprecated)
You have to prepare a technical lineage configuration file and run the lineage harvester to fetch the Power BI analysis results from the Collibra Data Lineage server and send them as an import job to your Collibra Data Intelligence Cloud environment.
Note: Comments in the lineage harvester configuration file are not supported.
Tip: For more information, see Collibra Data Lineage.
Prerequisites
- You have prepared the Power BI configuration file and executed the Power BI harvester.
- You have a global role that has the Manage all resources global permission.
- You have a global role with the Catalog global permission, for example, Catalog Author.
- You have the Technical lineage global permission.
- You have created a BI Catalog domain in which you want to ingest the Power BI assets.
- You have a resource role with the following resource permissions on the community level in which you created the BI Data Catalog domain:
  - Asset: add
  - Attribute: add
  - Domain: add
  - Attachment: add
- You have downloaded lineage harvester version 2022.05 or newer. We highly recommend that you always install and use the newest lineage harvester.
Steps
- Start the lineage harvester to create an empty lineage harvester configuration file by entering the following command:
  - Windows:
    .\bin\lineage-harvester.bat
  - Other operating systems:
    chmod +x bin/lineage-harvester
    and then
    bin/lineage-harvester
  An empty configuration file is created in the config folder.
- Open the configuration file and enter the values for each property.
  | Property | Description |
  | --- | --- |
  | general | This section describes the connection information between the lineage harvester and Data Catalog. |
  | catalog | This section contains the information that is necessary to connect to Data Catalog. |
  | url | The URL of your Collibra Data Intelligence Cloud environment. Note: You can only enter the public URL of your Collibra DGC environment. Other URLs are not accepted. |
  | username | The username that you use to sign in to Collibra. |
  | sources | This section describes the data sources for which you want to create the technical lineage. You have to create a configuration section for each data source. Note: You can add multiple data sources to the same configuration file. |
  | type | The kind of data source. In this case, the value has to be ExistingLineage. |
  | id | The unique ID to identify the Power BI service metadata that was uploaded to the Collibra Data Lineage server. The value has to be the same as the value you used in the sourceId property in the Power BI configuration file. Tip: This value can be anything as long as it is a unique ID and the same as the value of the sourceId property in the Power BI configuration file. The Power BI and lineage harvesters use the ID to identify a batch of data on the Collibra Data Lineage server. Tip: If you want to ingest multiple Power BI applications, create a separate Power BI configuration file for each Power BI application, each with a unique source ID. Duplicate the Power BI section in the lineage harvester configuration file and enter the source ID in the id property. |
- Save the configuration file.
- In the console, start the lineage harvester again by running the following command:
  - Windows:
    .\bin\lineage-harvester.bat full-sync
  - Other operating systems:
    ./bin/lineage-harvester full-sync
- When prompted, enter the password to connect to your Collibra Data Intelligence Cloud environment. The password is encrypted and stored in /config/pwd.conf.
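To make the configuration step above more concrete, the following Python sketch builds a configuration with the properties listed in the table and prints it as plain JSON. It is an illustration under assumptions, not an official template: the nesting (url and username inside catalog, catalog inside general, and one entry per data source in a sources list) is inferred from the property descriptions, and the URL, username, source IDs, and the helper name build_lineage_config are hypothetical. Compare the result with the empty configuration file that the lineage harvester generates before you rely on it.

```python
import json


def build_lineage_config(url, username, source_ids):
    """Return a dict with the properties described in the table above.

    Assumption: the nesting (general > catalog, plus a list of sources)
    is inferred from the property descriptions, not copied from an
    official template.
    """
    # Each Power BI application needs its own unique source ID.
    if len(set(source_ids)) != len(source_ids):
        raise ValueError("each Power BI source ID must be unique")
    return {
        "general": {
            "catalog": {"url": url, "username": username},
        },
        # One section per Power BI source; the type is always ExistingLineage.
        "sources": [
            {"type": "ExistingLineage", "id": source_id}
            for source_id in source_ids
        ],
    }


# Hypothetical values for illustration only.
config = build_lineage_config(
    url="https://example.collibra.com",
    username="lineage-admin",
    source_ids=["powerbi-sales", "powerbi-finance"],
)

# json.dumps emits plain JSON without comments, which matters because
# comments are not supported in the lineage harvester configuration file.
config_text = json.dumps(config, indent=2)
print(config_text)
```

Each value in source_ids has to match the sourceId property of the corresponding Power BI configuration file. The sketch rejects duplicate IDs because the harvesters use the ID to identify a batch of data on the Collibra Data Lineage server.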
What's next?
The lineage harvester triggers Collibra to import Power BI assets and their relations and create a technical lineage for Power BI Column assets. Collibra also stitches the new Power BI assets to existing assets in Data Catalog.
To refresh the Power BI metadata in Data Catalog, you can run the Power BI harvester and lineage harvester again or schedule jobs to run them automatically.
Tip: You can check the progress of the Power BI ingestion and technical lineage creation in Activities. The Results field indicates how many relations were imported into Data Catalog.
Warning: When you run the harvesters, Collibra Data Lineage creates all Power BI assets in the same Data Catalog BI domain. We highly recommend that you do not move these assets to another domain. If you move assets to another domain, they will be deleted and recreated in the initial Data Catalog BI domain when you synchronize Power BI. As a consequence, all manually added data of those assets is lost.