Prepare Power BI <source ID> configuration file

The lineage harvester uses the lineage harvester configuration file to collect Power BI data objects, and then sends the metadata to the Collibra Data Lineage server. If the useCollibraSystemName property in the lineage harvester configuration file is set to true, you must also provide a <source ID> configuration file that defines the system names of the databases in Power BI.

Collibra Data Lineage uses the system names to match the structure of databases in Power BI to assets in Data Catalog.

Tip

You can also include a filters section in your <source ID> configuration file, to specify the Power BI workspaces from which you want to ingest metadata.
The name "<source ID>" refers to the value of the sourceId property in the lineage harvester configuration file.

Steps

Create a new JSON file in the lineage harvester config folder.
Give the JSON file the same name as the value of the sourceId property in the lineage harvester configuration file.
Example The value of the sourceId property in the lineage harvester configuration file is power-bi-source-1. Therefore, the name of your JSON file should be power-bi-source-1.conf.
Important Your JSON file must have the file extension .conf.
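
The naming rule above can be sketched in a few lines of Python. This is only an illustration: the helper name source_config_path and the folder name "config" are hypothetical and not part of the lineage harvester.

```python
from pathlib import Path


def source_config_path(config_dir: str, source_id: str) -> Path:
    """Return the expected path of the <source ID> configuration file.

    Hypothetical helper: the file name is the sourceId value plus ".conf".
    """
    if not source_id:
        raise ValueError("source_id must not be empty")
    # Append the extension directly instead of using with_suffix(), so a
    # sourceId that itself contains dots is kept intact.
    return Path(config_dir) / f"{source_id}.conf"


# "config" stands in for the lineage harvester config folder (assumed name).
path = source_config_path("config", "power-bi-source-1")
print(path.name)  # power-bi-source-1.conf
```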

For each database in Power BI, add the following content to the JSON file:

Property

Description

Mandatory?

found_dbname=<database name>;found_hostname=<server name>;found_schema=<schema name>

The database information of supported data sources in Power BI, as typically collected by the lineage harvester. It identifies the server on which the database runs (found_hostname), the name of the database (found_dbname), and, optionally, the name of the schema (found_schema).

Tip

You can use wildcards to capture multiple connection string combinations:

Yes

dbname

The name of the database of a supported data source in Power BI.

schema

The name of the default schema of a supported data source in Power BI.

If the lineage harvester fails to find a specific schema, it uses the default schema.

dialect

The dialect of the supported data source in Power BI.

collibraSystemName

The system or server name of a database.

If you don't specify a value for this property, the system name is set to "DEFAULT".

Warning The value of this property must exactly match the name of your System asset in Collibra.

Important If you are using a <source ID> configuration file for the purpose of providing the true system name of an ODBC database in Power BI, you are not required to:

Set the useCollibraSystemName property in the lineage harvester configuration file to true.
Specify a Collibra system name in the <source ID> configuration file.

However, if the useCollibraSystemName property is set to true in the lineage harvester configuration file, then you must specify a Collibra system name in the <source ID> configuration file.

Yes

(unless you are using a <source ID> file to provide the true system names of ODBC databases in Power BI.)

filters

This section allows you to specify the Power BI workspaces from which you want to ingest metadata.

Warning If you don't want to specify the Power BI workspaces from which to ingest, you must completely remove this filters section.

Note The filters work as "workspace OR workspace OR capacity OR capacity": metadata is ingested from every workspace that matches any of the filters. This means that if you specify a capacity, all of the workspaces in that capacity are also ingested.

Tip

You can use wildcards to capture multiple connection string combinations:

domainId

The unique resource ID of the domain (or domains) in Collibra Data Intelligence Cloud into which you want to ingest the Power BI assets.

Tip You can find the domain ID by opening the domain and looking at the URL in your browser. The URL looks like https://<yourcollibrainstance>/domain/<domain ID>?<view>.

Yes

description

Any description, as you see fit.

Yes

workspaceNames

The names of Power BI workspaces from which you want to ingest metadata.

Important Any meta-characters in the name of a workspace must be enclosed in square brackets "[ ]". For example, a workspace with the name "Sale and Marketing [automobiles]" should be formatted as follows:
Sale and Marketing [[]automobiles[]]

workspaceIds

The IDs of Power BI workspaces from which you want to ingest metadata.

capacityNames

The names of capacities on which you want to filter.

capacityIds

The IDs of capacities on which you want to filter.

Warning Any letters in a capacity ID must be in upper case.
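
The formatting rules for the filter properties above can be checked before you save the file. The following is a minimal sketch, not part of the lineage harvester: both function names are hypothetical, and it assumes that a filter entry is an object whose workspace and capacity properties hold lists of strings. Only "[" and "]" are treated as meta-characters by default, because those are the only ones the example above confirms.

```python
def escape_workspace_name(name, meta_chars="[]"):
    """Wrap each meta-character in square brackets, e.g. "[" becomes "[[]".

    meta_chars defaults to the two characters shown in the documented
    example; extending it to other characters is an assumption.
    """
    return "".join(f"[{ch}]" if ch in meta_chars else ch for ch in name)


def check_filter_entry(entry):
    """Return a list of problems found in one filter entry (empty if none)."""
    problems = []
    # domainId and description are marked as mandatory in the table above.
    for key in ("domainId", "description"):
        if not entry.get(key):
            problems.append(f"missing mandatory property: {key}")
    # Any letters in a capacity ID must be in upper case.
    for capacity_id in entry.get("capacityIds", []):
        if capacity_id != capacity_id.upper():
            problems.append(f"capacity ID is not upper case: {capacity_id}")
    return problems


print(escape_workspace_name("Sale and Marketing [automobiles]"))
# Sale and Marketing [[]automobiles[]]
```

A call such as check_filter_entry({"description": "demo", "capacityIds": ["abc-123"]}) reports both the missing domainId and the lower-case capacity ID.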

Save the <source ID> configuration file.
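
To tie the steps together, here is a rough sketch that assembles a <source ID> configuration and writes it to a .conf file. The nesting is an assumption inferred from the property table, not something the table states: the connection string is used as the key of an object that holds dbname, schema, dialect, and collibraSystemName, and filters is a list of objects. Every value (database, server, dialect, IDs) is a placeholder, and the function names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def build_source_config(system_name):
    """Build an example configuration as a dictionary (assumed layout)."""
    # Placeholder connection details; replace with the values found in Power BI.
    connection_key = (
        "found_dbname=sales_db;found_hostname=sql-server-01;found_schema=dbo"
    )
    return {
        connection_key: {
            "dbname": "sales_db",
            "schema": "dbo",
            "dialect": "mssql",  # placeholder; check the supported dialect values
            # Must exactly match the name of the System asset in Collibra.
            "collibraSystemName": system_name,
        },
        # Remove the whole "filters" section if you don't want to filter.
        "filters": [
            {
                "domainId": "00000000-0000-0000-0000-000000000000",
                "description": "Sales workspaces",
                # Meta-characters in workspace names are enclosed in brackets.
                "workspaceNames": ["Sale and Marketing [[]automobiles[]]"],
                # Letters in capacity IDs are upper case.
                "capacityIds": ["A1B2C3D4-0000-0000-0000-000000000000"],
            }
        ],
    }


def write_source_config(config_dir, source_id, config):
    """Write the configuration as JSON to <config_dir>/<source_id>.conf."""
    path = Path(config_dir) / f"{source_id}.conf"
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return path


# Demonstration in a temporary folder instead of the real config folder.
with tempfile.TemporaryDirectory() as tmp:
    path = write_source_config(
        tmp, "power-bi-source-1", build_source_config("sql-server-01")
    )
    written_name = path.name
    loaded = json.loads(path.read_text(encoding="utf-8"))

print(written_name)  # power-bi-source-1.conf
```

Generating the file from a dictionary keeps the JSON syntactically valid, but it does not confirm that the lineage harvester accepts this layout; compare the output with a known working configuration before relying on it.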