The metadata harvesting process (deprecated)
Collibra uses two methods to harvest Power BI metadata: via REST API calls and via XMLA endpoints. The REST API retrieves basic metadata, and XMLA endpoints retrieve more specific metadata.
To enable the lineage harvester to access metadata in Power BI workspaces, you must add the workspaces to a Power BI Premium dedicated capacity and have the correct configurations in Microsoft Azure.
Note There are some limitations to the metadata harvesting process. Ensure that you understand these limitations before you start the harvesting process.
The following table shows which metadata the Power BI harvester retrieves and how.
| Metadata about... | is retrieved using... |
|---|---|
| Reports | Microsoft Azure Admin Power BI REST API calls. |
| Data set columns and lineage | XMLA endpoints. |
The content in this topic differs according to the authentication method.
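For orientation, the two retrieval channels in the table above correspond to two kinds of address: an HTTPS URL for the Power BI admin REST API and a `powerbi://` address for a workspace's XMLA endpoint. The following is a minimal sketch of how such addresses are formed. The helper names `admin_reports_url` and `xmla_endpoint` are hypothetical, and which specific REST operations the harvester calls is not stated in this topic, so the admin reports operation is used only as an illustration.

```python
# Base address of the Power BI REST API; admin operations live under /admin.
POWER_BI_API_BASE = "https://api.powerbi.com/v1.0/myorg"


def admin_reports_url(top=None):
    """Return the URL of the admin 'reports' REST operation (illustrative)."""
    url = f"{POWER_BI_API_BASE}/admin/reports"
    if top is not None:
        # $top limits the number of returned reports.
        url += f"?$top={int(top)}"
    return url


def xmla_endpoint(workspace_name):
    """Return the XMLA address of a workspace on a dedicated capacity.

    The workspace name is used as-is; some client tools expect spaces
    to be URL-encoded, which this sketch does not do.
    """
    return f"powerbi://api.powerbi.com/v1.0/myorg/{workspace_name}"


if __name__ == "__main__":
    print(admin_reports_url(top=100))
    print(xmla_endpoint("Sales"))  # "Sales" is a made-up workspace name
```

The XMLA address is only reachable for workspaces on a dedicated capacity, which is why the prerequisites above require one.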
Overview of the metadata harvesting process with username / password authentication
| Step | Retrieved via | Description |
|---|---|---|
| 1 | Power BI API calls | The Power BI harvester uses the username, password and application ID to access the Power BI APIs. These APIs retrieve basic Power BI metadata, for example metadata about the Power BI tenant or server and about reports. |
| 2 | XMLA | You add the Azure Active Directory user with a Power BI admin role in Power BI to a security group and grant that user the Contributor role in the Power BI workspaces. You add the Power BI workspaces that you want to ingest to the same security group. As a result, the Power BI harvester uses XMLA endpoints to retrieve more specific metadata, for example Power BI columns and lineage. Specific metadata from Power BI workspaces is only harvested if you added the Power BI workspaces to the Power BI dedicated capacity and you have the necessary permissions to harvest the metadata. |
Note Make sure that all necessary dedicated capacities are running and accessible to the Power BI harvester. Otherwise, the creation of assets for Power BI data sets and of your technical lineage may fail.
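To illustrate step 1 of the username / password flow, the sketch below builds a token request using the OAuth 2.0 resource owner password credentials grant of the Microsoft identity platform. Whether the harvester sends exactly this request is an assumption, since this topic only states that it uses the username, password and application ID. The function name and all sample values are hypothetical, and the request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Scope that requests the permissions configured for the Power BI service.
POWER_BI_SCOPE = "https://analysis.windows.net/powerbi/api/.default"


def build_password_token_request(tenant_id, application_id, username, password):
    """Build (but do not send) a token request for the password grant."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "password",
        "client_id": application_id,  # the application ID from Azure
        "username": username,
        "password": password,
        "scope": POWER_BI_SCOPE,
    }
    # The form is sent URL-encoded in the body of a POST request.
    return Request(url, data=urlencode(form).encode("utf-8"), method="POST")


if __name__ == "__main__":
    # Placeholder values only; never hard-code real credentials.
    req = build_password_token_request(
        "example-tenant", "example-app-id", "user@example.com", "example-password"
    )
    print(req.get_method(), req.full_url)
```

Passing the resulting request to `urllib.request.urlopen` would perform the actual exchange and return a JSON document that contains the access token.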
Overview of the metadata harvesting process with service principal authentication
| Step | Retrieved via | Description |
|---|---|---|
| 1 | Power BI API calls | The Power BI harvester uses the application ID and the client secret key of the Azure Active Directory application to access the Power BI APIs. These APIs retrieve basic Power BI metadata, for example metadata about the Power BI tenant or server and about reports. |
| 2 | XMLA | You add the service principal to a security group and grant it the Contributor role in the Power BI workspaces. As a result, the Power BI harvester uses XMLA endpoints to retrieve more specific metadata, for example Power BI columns and lineage. Specific metadata from Power BI workspaces is only harvested if you added the Power BI workspaces to the dedicated capacity and you have the necessary permissions to harvest the metadata. |
Note Make sure that all necessary dedicated capacities are running and accessible to the Power BI harvester. Otherwise, the creation of assets for Power BI data sets and of your technical lineage may fail.
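As an illustration of step 1 with a service principal, the sketch below builds a token request using the OAuth 2.0 client credentials grant of the Microsoft identity platform, then attaches the resulting token to a REST call as a bearer header. This is a sketch under assumptions: the topic states only that the application ID and client secret key are used, so the exact requests the harvester makes may differ. The function names and sample values are hypothetical, and nothing is sent over the network.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Scope that requests the permissions configured for the Power BI service.
POWER_BI_SCOPE = "https://analysis.windows.net/powerbi/api/.default"


def build_client_credentials_request(tenant_id, application_id, client_secret):
    """Build (but do not send) a token request for a service principal."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": application_id,     # application ID of the Azure AD app
        "client_secret": client_secret,  # client secret key of the same app
        "scope": POWER_BI_SCOPE,
    }
    return Request(url, data=urlencode(form).encode("utf-8"), method="POST")


def build_authorized_get(url, access_token):
    """Build a GET request that carries the access token as a bearer header."""
    headers = {"Authorization": f"Bearer {access_token}"}
    return Request(url, headers=headers, method="GET")


if __name__ == "__main__":
    # Placeholder values only; never hard-code real secrets.
    token_req = build_client_credentials_request(
        "example-tenant", "example-app-id", "example-secret"
    )
    print(token_req.get_method(), token_req.full_url)
    api_req = build_authorized_get(
        "https://api.powerbi.com/v1.0/myorg/admin/reports", "example-token"
    )
    print(api_req.get_header("Authorization"))
```

Unlike the username / password flow, no user account is involved here, which is why the service principal itself must be added to the security group and granted the Contributor role.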