Overview Power BI integration steps (deprecated)

The Power BI integration enables you to harvest Power BI metadata and create new Power BI assets in Data Catalog. Collibra analyzes and processes the BI metadata and presents it as specific asset types, retaining their original names.

Tip To ingest Power BI metadata in Data Catalog, you need to run two different harvesters: the Power BI harvester and the lineage harvester. The order in which you run the harvesters is important. You first have to run the Power BI harvester to collect the metadata from your Power BI application and then run the lineage harvester to import new Power BI assets and their relations in Data Catalog. The Power BI ingestion workflow explains which roles the harvesters play in the Power BI ingestion process.

Steps

The table below shows the steps and prerequisites required to integrate Power BI in Data Catalog. These steps are best practices: some of them are optional, but all of them are highly recommended.

Step

What?

Description

Prerequisites

1

Set up a Power BI application.

Before you start the Power BI integration in Data Catalog, make sure that the Power BI harvester can reach the Power BI metadata. Perform these tasks before you start the actual Power BI ingestion process:

Warning Because these tasks are performed outside of Collibra, it is possible that the content changes without us knowing. We strongly recommend that you carefully read the source documentation.

You have a Power BI subscription.

2

Create a new domain.

Before you can ingest Power BI metadata, you have to create a new domain or choose an existing domain to store the new Power BI assets.

You have a resource role with the following resource permissions:

  • Domain: Add
3

Optionally, assign the attribute type State to the global assignment of the Power BI Workspace asset type.

On Power BI Workspace asset pages, you can include the attribute type State to show the state of ingested Power BI workspaces. To do so, edit the global assignment of the Power BI Workspace asset type and assign the attribute type State.

If you delete a Power BI workspace, the workspace is maintained for a 90-day grace period. During the grace period, the workspace has the state Deleted. If you ingest Power BI metadata in Data Catalog during the grace period, the deleted workspace is also ingested.

For complete information on Power BI workspaces and possible states, see the Microsoft Power BI documentation.

You have a global role that has the System administration global permission.
4

Ingest or import assets from supported JDBC data sources.

The Collibra Data Lineage server connects to Data Catalog and reads the full paths of existing assets. When the full path matches the full path of assets in Power BI, the Collibra Data Lineage server automatically stitches them.
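As a rough illustration of the matching idea only, the sketch below treats stitching as an exact comparison of full paths. It is not Collibra's implementation: the function name and the path strings are hypothetical, and the real full-path format is defined by Collibra Data Lineage.

```python
def find_stitchable(catalog_paths, powerbi_paths):
    """Return the full paths that occur in both collections, sorted."""
    return sorted(set(catalog_paths) & set(powerbi_paths))


# Hypothetical full paths; real values come from Data Catalog and Power BI.
catalog = ["db1>sales>orders>order_id", "db1>sales>orders>amount"]
powerbi = ["db1>sales>orders>amount", "db1>hr>staff>name"]

print(find_stitchable(catalog, powerbi))
```

In this sketch only paths that are identical character for character count as a match, which is why it is worth checking that assets from JDBC data sources are ingested with the full paths that Power BI refers to.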

Permissions depend on how you ingest or import the assets.
5

Download and install the Power BI harvester.

You use the Power BI harvester to collect metadata from Power BI and upload it to Collibra where the metadata is scanned, processed and analyzed.

The installer file contains the following:

  • a config folder with an empty configuration file.
  • a bin folder.
  • a TXT file with more information about the configuration file.
  • a BAT file that you use to run the harvester.

  • You have Collibra Data Intelligence Cloud 2020.11 or newer.
  • You have access to the Power BI harvester on the Downloads page.
  • Your environment meets the system requirements to install and use the Power BI harvester.
  • You have added firewall rules so that the Power BI harvester can connect to the Collibra Data Lineage server with one of the following IP addresses:
    • 15.222.200.199 (techlin-aws-ca.collibra.com)
    • 18.198.89.106 (techlin-aws-eu.collibra.com)
    • 13.228.38.245 (techlin-aws-sg.collibra.com)
    • 54.242.194.190 (techlin-aws-us.collibra.com)
    • 51.105.241.132 (techlin-azure-eu.collibra.com)
    • 20.102.44.39 (techlin-azure-us.collibra.com)
    • 35.197.182.41 (techlin-gcp-au.collibra.com)
    • 34.152.20.240 (techlin-gcp-ca.collibra.com)
    • 35.205.146.124 (techlin-gcp-eu.collibra.com)
    • 34.87.122.60 (techlin-gcp-sg.collibra.com)
    • 35.234.130.150 (techlin-gcp-uk.collibra.com)
    • 34.73.33.120 (techlin-gcp-us.collibra.com)
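
To check the firewall rules before you run the harvester, a small connectivity test can help. The following is a minimal sketch, not part of the Power BI harvester: the function names are hypothetical, and port 443 is an assumption that you should confirm for your environment. The host names are taken from the list above.

```python
import socket

# Host names from the list above; keep only the ones for your region.
HOSTS = ["techlin-aws-eu.collibra.com", "techlin-aws-us.collibra.com"]
PORT = 443  # assumption: HTTPS; confirm the port for your environment


def can_connect(host, port=PORT, timeout=5.0, connect=socket.create_connection):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with connect((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def report(hosts, check=can_connect):
    """Return a mapping of each host to True (reachable) or False."""
    return {host: check(host) for host in hosts}
```

Calling report(HOSTS) from the machine on which you install the harvester returns one entry per host. A value of False suggests that a firewall rule is missing or that the host name cannot be resolved.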

Important Ingestion results vary according to your Power BI subscription.

6

Prepare the Power BI configuration file and run the Power BI harvester.

You create a configuration file to provide the connection information that you need to connect your Power BI application to Collibra and to the domain in which you want to ingest the Power BI assets.

You can access an empty configuration file in the Power BI harvester installation folder. When you have created and saved the configuration file, you can run the Power BI harvester, which uploads the Power BI metadata to Collibra.

Example configuration file:
{
 "powerbi": {
  "tenantdomain": "<organization.onmicrosoft.com>",
  "applicationId": "<microsoft-azure-id>",
  "userName": "<your-power-bi-email-address>",
  "password": "<password-to-access-power-bi>",
  "workspaceFilter": "<filter-workspace-name>"
 },
 "techlin": {
  "sourceId" : "<unique-power-bi-ID>"
 },
 "catalog": {
  "domainId": "<your-catalog-domain>",
  "url": "<url-to-collibra>",
  "userName": "<my-collibra-username>",
  "password": "<my-collibra-password>"
 }
}
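
Before you run the Power BI harvester, it can be useful to check that the configuration file is valid JSON and that no placeholder is left unreplaced. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and the list of required keys is simply copied from the example above (without workspaceFilter, which is assumed to be optional), not from an official schema.

```python
import json

# Sections and keys as they appear in the example above (assumed required).
REQUIRED = {
    "powerbi": ["tenantdomain", "applicationId", "userName", "password"],
    "techlin": ["sourceId"],
    "catalog": ["domainId", "url", "userName", "password"],
}


def check_config(text):
    """Return a list of problems found in the configuration JSON text."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err}"]
    if not isinstance(config, dict):
        return ["top level must be a JSON object"]
    problems = []
    for section, keys in REQUIRED.items():
        values = config.get(section)
        if not isinstance(values, dict):
            problems.append(f"missing section: {section}")
            continue
        for key in keys:
            value = values.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"missing value: {section}.{key}")
            elif value.startswith("<"):
                # A value such as "<password-to-access-power-bi>" was not filled in.
                problems.append(f"placeholder not replaced: {section}.{key}")
    return problems


sample = '{"powerbi": {}, "techlin": {"sourceId": "<unique-power-bi-ID>"}}'
for problem in check_config(sample):
    print(problem)
```

An empty list means that the file passed these basic checks. It does not guarantee that the credentials or the domain ID are correct.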

Tip For a full ingestion, we highly recommend that you have a Power BI Premium subscription.

7

Download and install the lineage harvester.

You use the lineage harvester to trigger the creation of Power BI assets, their relations and a technical lineage in Data Catalog.

You can download the lineage harvester from the Collibra Product Resource Downloads page.

Your environment meets the system requirements to install and use the lineage harvester.

8

Prepare the lineage harvester configuration file and run the lineage harvester.

You create a lineage harvester configuration file with Power BI connection information and run the lineage harvester to import the results of the Power BI integration and the technical lineage for Power BI into Data Catalog.

As a result, Collibra creates new Power BI assets in Data Catalog and imports relations between these assets. It also creates a technical lineage for Power BI assets and for any other data sources defined in the lineage harvester configuration file.

Example configuration file:
{
 "general" : {
  "catalog" : {
   "url" : "https://catalog-instance.collibra.com",
   "username" : "Admin"
   }
 },
 "sources" : [ {
  "type" : "ExistingLineage",
  "id" : "power-bi-1"
 } ]
}
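
If you prefer to generate the lineage harvester configuration file instead of editing it by hand, the following minimal sketch produces JSON with the same shape as the example above. The function name is hypothetical. The sketch also assumes that the id of the ExistingLineage source is meant to correspond to the sourceId in the Power BI harvester configuration file; verify this in the Collibra Data Lineage documentation.

```python
import json


def build_lineage_config(catalog_url, username, source_id):
    """Return lineage harvester configuration text shaped like the example."""
    config = {
        "general": {"catalog": {"url": catalog_url, "username": username}},
        # Assumption: source_id corresponds to "sourceId" in the Power BI
        # harvester configuration file.
        "sources": [{"type": "ExistingLineage", "id": source_id}],
    }
    return json.dumps(config, indent=1)


print(build_lineage_config("https://catalog-instance.collibra.com", "Admin", "power-bi-1"))
```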

Tip For more information about the lineage harvester, see the Collibra Data Lineage documentation.

9

View the Power BI assets and technical lineage.

After the Power BI metadata is ingested in Data Catalog, you can go to the domain where you ingested Power BI and see the list of ingested Power BI assets. These assets are automatically stitched to existing assets in Data Catalog.

You can go to a Power BI Column asset page and click the Technical lineage tab to view the technical lineage.

Note If you ingest Power BI for the first time or if you change your geolocation or cloud provider, you have to restart the DGC service before you can see your technical lineage.

Warning When you run the harvesters, Collibra Data Lineage creates all Power BI assets in the same Data Catalog BI domain. We highly recommend that you do not move these assets to another domain. If you move assets to another domain, they will be deleted and recreated in the initial Data Catalog BI domain when you synchronize Power BI. As a consequence, all manually added data of those assets is lost.

You have a Data Catalog global role with the Technical lineage global permissions.
