Prepare the lineage harvester configuration file for Power BI Report Server

You have to prepare a technical lineage configuration file before you run the lineage harvester. The lineage harvester collects your Power BI Report Server metadata and sends it to Collibra Data Intelligence Cloud, where it is processed and analyzed. Collibra then imports the Power BI Report Server assets and relations to Data Catalog.

Tip We highly recommend that you use the configuration file generator to make sure your configuration file is valid.

Prerequisites

Steps

  1. Run the following command to start the lineage harvester:
    • Windows: .\bin\lineage-harvester.bat
    • Other operating systems: chmod +x bin/lineage-harvester and then bin/lineage-harvester
    An empty configuration file is created in the lineage harvester config folder.
  2. Open the lineage-harvester.conf file and enter the values for each property.
    Properties    Description
    general

    This section describes the connection information between the lineage harvester and Data Catalog.

    catalog

    This section contains information that is necessary to connect to Data Catalog.

    url

    The URL of your Collibra Data Intelligence Cloud environment.

    Note You can only enter the public URL of your Collibra Data Intelligence Cloud environment. Other URLs will not be accepted.

    username

    The username that you use to sign in to Collibra.

    useCollibraSystemName

    Indicates whether to include the system or server name of a data source, to differentiate between data sources with the same name.

    Tip We highly recommend that you set useCollibraSystemName to true only if you want to enable the Collibra Data Lineage server to process multiple databases with the same name.

    sources

    This section contains all Power BI Report Server connection properties.

    collibraSystemName

    The system or server that you use when you ingest Power BI Report Server.

    If you only want to ingest one Power BI Report Server source, this property is optional. If you want to ingest multiple Power BI Report Server sources, you have to leave this property empty and create a Power BI Report Server <source ID> configuration file.

    Note If the useCollibraSystemName is set to true, you have to include a separate configuration file that maps the data objects in Power BI Report Server to a system > database > schema > table > column structure.

    id

    The unique ID to identify the Power BI Report Server metadata that was uploaded to the Collibra Data Lineage server.

    Tip This value can be anything as long as it is unique. The lineage harvester uses the ID to identify a batch of data on the Collibra Data Lineage server.

    type
    The kind of data source. In this case, the value has to be PBIRS.
    url

    The URL to the Power BI Report Server web portal. By default, the URL is http://<computer-name>/reports.

    userName

    The username you use to sign in to the Power BI Report Server web portal.

    Tip If you use NTLM authentication, your username also contains the NTLM domain name. For example, MyDomain\\username.

    domainId

    The unique ID of the domain in Collibra Data Intelligence Cloud in which you want to ingest the Power BI assets.

    folderFilter

    An option to limit the ingestion process to specific Power BI Report Server folders that contain reports or KPIs.

    You can add multiple folders by listing folder names, providing the full path to folders or by using a wildcard:

    • Use folder names when the folder name is unique: ["PBIRS folder 1", "PBIRS folder 2"]
    • Use the full path to the folder to only ingest a specific folder: ["/database1/folder1", "/database2/folder2"]
    • Use a wildcard to ingest all child folders of a specific folder: ["/folder1/*", "/folder2/*"]

    You can also use a combination of these methods. For example, ["PBIRS folder 1", "/database/folder2", "/folder3/*"].

    If the folderFilter field remains empty or is deleted from the configuration file, all accessible Power BI Report Server folders are processed and ingested.

    Tip For more information about connecting to a Power BI Report Server folder, see the Microsoft documentation.

  3. Save the configuration file.
  4. In the console, start the lineage harvester again by running the following command:
    • Windows: .\bin\lineage-harvester.bat full-sync
    • Other operating systems: ./bin/lineage-harvester full-sync
  5. When prompted, enter the passwords to connect to Collibra and Power BI Report Server. Do one of the following:
    • Enter the passwords in the console.
      The passwords are encrypted and stored in /config/pwd.conf.
    • Provide the passwords via command line.
      The passwords are stored locally and not in your lineage harvester folder.
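
The three folderFilter entry styles listed in step 2 can be told apart mechanically, which may help when you review a long filter list. The following Python sketch is illustrative only: the function name classify_filter_entry and the rules it applies (a trailing /* marks a wildcard, a leading / marks a full path, anything else is treated as a folder name) are assumptions inferred from the examples in this topic. It does not reproduce the lineage harvester's actual matching logic.

```python
def classify_filter_entry(entry):
    """Label a folderFilter entry by the style it appears to use.

    Assumption: the rules below are inferred from the examples in this
    topic; they are not taken from the lineage harvester itself.
    """
    if entry.endswith("/*"):
        return "wildcard"      # for example "/folder3/*"
    if entry.startswith("/"):
        return "full path"     # for example "/database/folder2"
    return "folder name"       # for example "PBIRS folder 1"


# The combined example from the folderFilter description.
folder_filter = ["PBIRS folder 1", "/database/folder2", "/folder3/*"]

labels = {entry: classify_filter_entry(entry) for entry in folder_filter}
for entry, label in labels.items():
    print(f"{entry!r}: {label}")
```

The wildcard check runs first because a wildcard entry also starts with a slash and would otherwise be labeled as a full path.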

Example

{
 "general": {
  "catalog": {
   "url": "https://<organization>.collibra.com",
   "userName": "<your-collibra-username>"
  }
 },
 "sources": {
  "collibraSystemName" : "",
  "id": "<Power BI Report Server-id>",
  "type": "PBIRS",
  "url": "http://<IP address>/Reports.com",
  "userName": "<Power BI Report Server-api-user-name>",
  "domainId": "<domain-resource-id>",
  "folderFilter":  ["/Folder1/*", "Folder2"]
 }
}
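
If you prefer to produce the file from a script, a JSON library takes care of quoting and escaping. The sketch below is a minimal illustration in Python, not part of the lineage harvester: it mirrors the key names and layout of the example above, uses the default portal URL pattern from the url property description, and keeps the placeholder values. Its main purpose is to show why an NTLM username appears as MyDomain\\username in the file: a single backslash has to be escaped in JSON.

```python
import json

# Placeholder values are copied from the example; replace them with your own.
config = {
    "general": {
        "catalog": {
            "url": "https://<organization>.collibra.com",
            "userName": "<your-collibra-username>",
        }
    },
    "sources": {
        "collibraSystemName": "",
        "id": "<Power BI Report Server-id>",
        "type": "PBIRS",
        "url": "http://<computer-name>/reports",
        # In Python source, "\\" is ONE backslash: the value is MyDomain\username.
        "userName": "MyDomain\\username",
        "domainId": "<domain-resource-id>",
        "folderFilter": ["/Folder1/*", "Folder2"],
    },
}

# json.dumps escapes the backslash again, so the JSON text contains
# "MyDomain\\username", which is the form shown in the userName tip.
text = json.dumps(config, indent=1)
print(text)
```

Reading the text back with json.loads returns the original single-backslash value, so the round trip is lossless.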

Tip We highly recommend that you use the lineage harvester configuration file generator when you create a configuration file.
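
For a hand-written file, a quick structural check before you run full-sync can catch simple mistakes such as a stray quotation mark or a wrong type value. The following Python sketch is a hypothetical helper, not a Collibra tool. The function name check_config, the list of keys treated as required, and the assumption that sources is a single object (as in the example above) are all illustrative choices; the configuration file generator remains the recommended way to get a valid file.

```python
import json

# Assumption: these source keys are treated as required for this sketch.
# collibraSystemName and folderFilter are described as optional in this topic.
REQUIRED_SOURCE_KEYS = ("id", "url", "userName", "domainId")


def as_dict(value):
    """Return value if it is a JSON object, otherwise an empty dict."""
    return value if isinstance(value, dict) else {}


def check_config(text):
    """Return a list of problems found in the configuration text."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"not valid JSON: {err}"]

    problems = []
    config = as_dict(config)
    catalog = as_dict(as_dict(config.get("general")).get("catalog"))
    if not catalog.get("url"):
        problems.append("general.catalog.url is missing or empty")

    source = as_dict(config.get("sources"))
    for key in REQUIRED_SOURCE_KEYS:
        if not source.get(key):
            problems.append(f"sources.{key} is missing or empty")
    if source.get("type") != "PBIRS":
        problems.append('sources.type has to be "PBIRS"')
    return problems


# A doubled quotation mark makes the whole file unparseable.
broken = '{"sources": {"url": ""http://<computer-name>/reports"}}'
for problem in check_config(broken):
    print(problem)
```

The check only looks at structure. It cannot confirm that a URL is reachable or that a domain ID exists in Collibra.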

What's next?

The lineage harvester triggers Collibra to import Power BI Report Server assets and their relations.

If issues occur during the Power BI Report Server ingestion process, check the Collibra Data Lineage troubleshooting section to solve your problems.

To synchronize the Power BI Report Server metadata, you can run the lineage harvester again or schedule jobs to run it automatically.

Tip You can check the progress of the Power BI Report Server ingestion in Activities. The results field indicates how many relations were imported into Data Catalog.