Overview Power BI integration steps
The Power BI integration enables you to harvest Power BI metadata and create new Power BI assets in Data Catalog. Collibra analyzes and processes the BI metadata and presents it as specific asset types, retaining their original names.
Steps
The table below shows the steps and prerequisites required to integrate Power BI in Data Catalog. These steps are best practices: some of them are optional but highly recommended.
Important In the global assignment of each asset type included in the
|
Step |
What? |
Description |
Prerequisites |
|---|---|---|---|
|
1 |
Set up a Power BI application. |
Before you start the Power BI integration in Data Catalog, make sure that the lineage harvester can reach the Power BI metadata. Perform these tasks before you start the actual Power BI ingestion process:
Warning Because these tasks are performed outside of Collibra, it is possible that the content changes without us knowing. We strongly recommend that you carefully read the source documentation. |
|
|
2 |
Prepare one or more new domains. |
Before you can ingest Power BI metadata, you have to designate a domain for storing the new Power BI assets. You can choose an existing domain or create one or more new domains. Note Make a note of the reference ID of the domain. You need to specify this reference ID in the lineage harvester configuration file. |
|
|
3 |
Optionally, assign the attribute type State to the global assignment of the Power BI Workspace asset type. |
On Power BI Workspace asset pages, you can include the attribute type State to show the state of ingested Power BI workspaces. To do so, edit the global assignment of the Power BI Workspace asset type and assign the attribute type State. If you delete a Power BI workspace, the workspace is retained for a 90-day grace period. During the grace period, the workspace has the state Deleted. When you ingest Power BI metadata in Data Catalog, this deleted workspace is also ingested. For complete information on Power BI workspaces and possible states, see the Microsoft Power BI documentation. |
You have a global role that has the System administration global permission. |
|
4 |
Download and install the lineage harvester. |
You use the lineage harvester to trigger the creation of Power BI assets, their relations and a technical lineage in Data Catalog. We highly recommend that you always install and use the newest lineage harvester. You can download the lineage harvester from the Collibra Product Resource Downloads page. |
|
|
5 |
Prepare the lineage harvester configuration file and run the lineage harvester. |
You create a lineage harvester configuration file with Power BI connection information and run the lineage harvester to import the results of the Power BI integration and the technical lineage for Power BI into Data Catalog. As a result, Collibra creates new Power BI assets in Data Catalog and imports relations between these assets. It also creates a technical lineage for Power BI assets and other data sources in the lineage harvester configuration file.
{
  "general": {
    "catalog": {
      "url": "https://catalog-instance.collibra.com",
      "username": "Admin"
    },
    "useCollibraSystemName": false
  },
  "sources": [
    {
      "type": "PowerBI",
      "id": "power-bi-id",
      "tenantDomain": "collibrapowerbi.onmicrosoft.com",
      "loginFlow": {
        "type": "ServicePrincipal",
        "applicationId": "ab123cde-1234-1234-1234-abcd12e34fg5"
      },
      "domainId": "domain-reference-ID",
      "deleteRawMetadataAfterProcessing": true
    }
  ]
}
Important If the Tip For more information about the lineage harvester, see the Collibra Data Lineage documentation. |
|
|
6 |
Prepare the Power BI <source ID> configuration file. |
If the |
You know the names or IDs of the capacities or workspaces you want to ingest. |
| 7 | Manually refresh your Power BI datasets. |
Important Carry out this step only if this is the first time you're integrating Power BI in Data Catalog. The first time you integrate Power BI, you need to make sure that the data in your Power BI datasets is up-to-date. After that, Microsoft automatically refreshes the datasets every 90 days. For complete information, see: |
See Power BI prerequisites. |
| 8 | Run the lineage harvester again. |
Important Carry out this step only if this is the first time you're integrating Power BI in Data Catalog.
When prompted, enter the passwords to connect to your Collibra environment. The passwords are encrypted and stored in /config/pwd.conf. |
Same as for step 5. |
|
9 |
View the Power BI assets and technical lineage. |
After the Power BI metadata is ingested in Data Catalog, you can go to the domain where you ingested Power BI and see the list of ingested Power BI assets. These assets are automatically stitched to existing assets in Data Catalog. To view the technical lineage, go to a Power BI Column asset page and click the Technical lineage tab. Note If you ingest Power BI for the first time or if you change your geolocation or cloud provider, you have to restart the DGC service before you can see your technical lineage. Warning When you run the lineage harvester, Collibra Data Lineage creates all Power BI assets in the Data Catalog BI domain (or domains) you specified in the Power BI <source ID> configuration file. We highly recommend that you do not move these assets to other domains. If you move assets to other domains, they are deleted and recreated in the initial Data Catalog BI domains when you synchronize Power BI. As a result, all manually added characteristics of those assets are lost. |
|
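
Example: checking the configuration file before you run the lineage harvester

The configuration file in step 5 is plain JSON, so you can generate it and sanity-check it with a short script before you run the lineage harvester. The following Python sketch is not part of the lineage harvester. It rebuilds the sample from step 5 and reports fields that are missing or that still hold the placeholder domain reference ID. The function names, the set of fields treated as required, and the all-zero domain ID are assumptions made for illustration; they are derived from the sample above, not from an official schema.

```python
import json

# Assumption: these keys are treated as required only because they appear
# in the sample configuration in step 5. This is not an official schema.
REQUIRED_SOURCE_KEYS = {"type", "id", "tenantDomain", "loginFlow", "domainId"}

# The placeholder used for the domain reference ID in the sample.
DOMAIN_PLACEHOLDER = "domain-reference-ID"


def build_power_bi_config(catalog_url, username, source_id,
                          tenant_domain, application_id, domain_id):
    """Return a dict with the same structure as the sample in step 5."""
    return {
        "general": {
            "catalog": {"url": catalog_url, "username": username},
            "useCollibraSystemName": False,
        },
        "sources": [
            {
                "type": "PowerBI",
                "id": source_id,
                "tenantDomain": tenant_domain,
                "loginFlow": {
                    "type": "ServicePrincipal",
                    "applicationId": application_id,
                },
                "domainId": domain_id,
                "deleteRawMetadataAfterProcessing": True,
            }
        ],
    }


def find_problems(config):
    """Return a list of problems; an empty list means the basic shape looks right."""
    problems = []
    catalog = config.get("general", {}).get("catalog", {})
    for key in ("url", "username"):
        if not catalog.get(key):
            problems.append(f"general.catalog.{key} is missing")
    sources = config.get("sources", [])
    if not sources:
        problems.append("sources is empty")
    for index, source in enumerate(sources):
        for key in sorted(REQUIRED_SOURCE_KEYS - set(source)):
            problems.append(f"sources[{index}].{key} is missing")
        if source.get("domainId") == DOMAIN_PLACEHOLDER:
            problems.append(
                f"sources[{index}].domainId still holds the placeholder value"
            )
    return problems


# Values copied from the sample, except the domain ID, which is a
# hypothetical stand-in for the reference ID you noted in step 2.
config = build_power_bi_config(
    catalog_url="https://catalog-instance.collibra.com",
    username="Admin",
    source_id="power-bi-id",
    tenant_domain="collibrapowerbi.onmicrosoft.com",
    application_id="ab123cde-1234-1234-1234-abcd12e34fg5",
    domain_id="00000000-0000-0000-0000-000000000000",
)

config_text = json.dumps(config, indent=2)
print(config_text)
print(find_problems(config))
```

The script only checks the structure of the file. It does not verify that the domain exists, that the Power BI application is reachable, or that the credentials are valid; the lineage harvester reports those issues when it runs.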