Prepare Power BI <source ID> configuration file

The lineage harvester uses a lineage harvester configuration file to collect the Power BI data objects. It then sends the metadata to the Collibra Data Lineage service.

The <source ID> configuration file allows you to:

  • Specify the name of a database, the server on which the database is running, and, optionally, the name of the schema.
  • Configure workspace filtering.
    Tip We highly recommend that you read through Filtering Power BI workspaces for important information and guidance before configuring your filters.
  • If useCollibraSystemName in the lineage harvester configuration file is set to true, use the collibraSystemName property to specify the system name of databases in Power BI. Collibra Data Lineage uses the system names to match the structure of databases in Power BI to assets in Data Catalog.
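
The rule in the last bullet can be checked before you run the lineage harvester. The following Python sketch is illustrative only and is not part of the lineage harvester. The function name missing_system_names is hypothetical, and the sketch assumes that each database entry in the <source ID> configuration file is a top-level key that starts with found_dbname= and maps to an object of properties. Verify that layout against your own file before relying on the check.

```python
def missing_system_names(source_config: dict, use_collibra_system_name: bool) -> list:
    """Return the database keys that lack a collibraSystemName value.

    Assumption: database entries are top-level keys that start with
    "found_dbname=" and map to an object of properties.
    """
    if not use_collibra_system_name:
        # The system name is only required when the flag is set to true.
        return []
    return [
        key
        for key, props in source_config.items()
        if key.startswith("found_dbname=") and not props.get("collibraSystemName")
    ]


# Placeholder entries for illustration; the names are made up.
example = {
    "found_dbname=sales;found_hostname=server-1": {"collibraSystemName": "server-1"},
    "found_dbname=hr;found_hostname=server-2": {"dbname": "hr"},
}
print(missing_system_names(example, use_collibra_system_name=True))
```

With the placeholder entries above, only the second database is reported, because it has no collibraSystemName value.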

Steps

  1. Create a new JSON file in the lineage harvester config folder.
  2. Give the JSON file the same name as the value of the sourceId property in the lineage harvester configuration file.
    Example The value of the sourceId property in the lineage harvester configuration file is power-bi-source-1. Therefore, the name of your JSON file should be power-bi-source-1.conf.
    Important Your JSON file must have the file extension .conf.
  3. For each database in Power BI, add the following content to the JSON file:

    Each property below is followed by its description and by whether it is mandatory.

    found_dbname=<database name>;found_hostname=<server name>;found_schema=<schema name>

    The database information of supported data sources in Power BI that is typically collected by the lineage harvester. It allows you to specify the name of the database (found_dbname), the server on which the database is running (found_hostname), and, optionally, the name of the schema (found_schema).

    Tip You can use wildcards to capture multiple connection string combinations.

    Mandatory? Yes

    dbname

    The name of the database of a supported data source in Power BI.

    Mandatory? No

    schema

    The name of the default schema of a supported data source in Power BI.

    If the lineage harvester fails to find a specific schema, it uses the default schema.

    Mandatory? No

    dialect

    The dialect of the supported data source in Power BI.

    Mandatory? No

    collibraSystemName

    The system or server name of a database.

    If you don't specify a value for this property, "DEFAULT" is shown in the technical lineage harvester.

    Warning The value of this property is case-sensitive and must exactly match the name of your System asset in Collibra.

    Important If you are using a <source ID> configuration file for the purpose of providing the true system name of an ODBC database in Power BI, you are not required to:
    • Set the useCollibraSystemName property in the lineage harvester configuration file to true.
    • Specify a Collibra system name in the <source ID> configuration file.
    However, if the useCollibraSystemName property is set to true in the lineage harvester configuration file, then you must specify a Collibra system name in the <source ID> configuration file.

    Mandatory? Yes, unless you are using a <source ID> configuration file to provide the true system names of ODBC databases in Power BI.

    filters

    This section allows you to specify the Power BI workspaces from which you want to ingest metadata.

    The filters work as "workspace AND workspace AND capacity AND capacity", meaning that if you specify a capacity, all of the workspaces in that capacity are also ingested.

    Warning If you don't want to specify the Power BI workspaces from which to ingest, you must completely remove this filters section.

    Tip You can use wildcards to capture multiple connection string combinations.

    Mandatory? No

    domainId

    The unique resource ID of the domain or domains in Collibra Data Intelligence Cloud into which you want to ingest the Power BI assets.

    Tip You can find the domain ID by clicking the domain type. Then look in the URL of your browser to find the ID. The URL looks like https://<yourcollibrainstance>/domain/<domain ID>?<view>.

    Mandatory? Yes

    description

    Any description, as you see fit.

    Mandatory? Yes

    workspaceNames

    The names of Power BI workspaces from which you want to ingest metadata.

    Important Each meta-character in the name of a workspace must be enclosed in its own pair of square brackets "[ ]". For example, a workspace with the name "Sale and Marketing [automobiles]" should be formatted as follows:
    Sale and Marketing [[]automobiles[]]

    Mandatory? No

    workspaceIds

    The IDs of Power BI workspaces from which you want to ingest metadata.

    Tip We highly recommend that you read through Filtering Power BI workspaces for important information and guidance before configuring your filters.

    Mandatory? No

    capacityNames

    The names of capacities on which you want to filter.

    Mandatory? No

    capacityIds

    The IDs of capacities on which you want to filter.

    Warning Any letters in a capacity ID must be in upper case.

    Mandatory? No

    excludeWorkspaceNames

    The names of Power BI workspaces that you want to exclude from the ingestion job.

    This is useful if you want to exclude, for example, dedicated development and testing workspaces.

    Note The metadata of inactive and personal workspaces is not harvested or uploaded to the Collibra Data Lineage service instance. An inactive workspace is one for which no reports or dashboards have been viewed in the past 60 days. My workspace is the personal workspace for any Power BI customer to work with their own, personal content.

    For complete details on the advantages, limitations and configuration considerations of this property, see Filtering Power BI workspaces.

    Mandatory? No

    excludeWorkspaceIds

    The IDs of Power BI workspaces that you want to exclude from the ingestion job.

    This is useful if you want to exclude, for example, dedicated development and testing workspaces.

    For complete details on the advantages, limitations and configuration considerations of this property, see Filtering Power BI workspaces.

    Mandatory? No
  4. Save the <source ID> configuration file.
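
The steps above can be sketched in Python as a way to generate the file instead of writing it by hand. This is a minimal sketch under stated assumptions, not the product's own tooling. All function names (config_file_name, found_key, escape_workspace_name, save_config) are hypothetical, every value is a placeholder, and the nesting of the filters section as a list of objects is an assumption that you should check against a known-good configuration file. The dialect property is left out because its valid values are not listed here.

```python
import json
from pathlib import Path
from typing import Optional


def config_file_name(source_id: str) -> str:
    """Name the file after the sourceId value, with the .conf extension."""
    return f"{source_id}.conf"


def found_key(dbname: str, hostname: str, schema: Optional[str] = None) -> str:
    """Build the database key; the found_schema part is optional."""
    parts = [f"found_dbname={dbname}", f"found_hostname={hostname}"]
    if schema is not None:
        parts.append(f"found_schema={schema}")
    return ";".join(parts)


def escape_workspace_name(name: str, meta_chars: str = "[]") -> str:
    """Enclose each meta-character in its own pair of square brackets.

    Only "[" and "]" are confirmed by the workspaceNames example; pass
    other characters in meta_chars only if your filtering rules need it.
    """
    return "".join(f"[{ch}]" if ch in meta_chars else ch for ch in name)


# Placeholder values throughout; replace them with your own.
config = {
    found_key("sales", "server-1", "dbo"): {
        "dbname": "sales",
        "schema": "dbo",
        "collibraSystemName": "server-1",
    },
    # Assumption: filters is a list of objects, one per domain.
    "filters": [
        {
            "domainId": "<domain ID>",
            "description": "Sales workspaces",
            "workspaceNames": [
                escape_workspace_name("Sale and Marketing [automobiles]")
            ],
            # Letters in a capacity ID must be upper case; this ID is made up.
            "capacityIds": ["0a1b2c3d-0000-0000-0000-000000000000".upper()],
        }
    ],
}


def save_config(config_dir: Path, source_id: str, content: dict) -> Path:
    """Write the content as JSON to <config_dir>/<source_id>.conf."""
    path = config_dir / config_file_name(source_id)
    path.write_text(json.dumps(content, indent=2), encoding="utf-8")
    return path
```

For example, save_config(Path("config"), "power-bi-source-1", config) would write config/power-bi-source-1.conf, which matches the file name from step 2. The folder name config is a placeholder for your lineage harvester config folder.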