Power BI source configuration

Updated: May 13, 2026

The Source configuration field in the Power BI technical lineage Edge capability allows you to:

Map the names of the server, database and schema that were collected by the lineage harvester to their true names.
Note Mapping doesn't work for custom SQL.
Configure filtering. We highly recommend that you read through Filtering Power BI workspaces for important information and guidance before configuring your filters.
If useCollibraSystemName in the lineage harvester configuration file is set to true, use the collibraSystemName property to specify the system name of databases in Power BI. Collibra Data Lineage uses the system names to match the structure of databases in Power BI to assets in Data Catalog.

Tip

Filtering v2 allows you to filter on dashboards and reports — including in-app reports — in addition to capacities and workspaces. You are still free to use your v1 filter configuration. Both filter versions are addressed in this topic.
If you previously integrated Power BI via the lineage harvester (deprecated), you can copy and paste the JSON code from your <source ID> configuration file into the Source configuration field.

Source configuration with filtering v2

The value of the Source configuration field must be a valid block of JSON code, for example:

{
  "found_dbname=databasename1;found_hostname=*;found_schema=schema1": {
    "dbname": "mssql-database-name",
    "schema": "mssql-schema-name",
    "dialect": "mssql",
    "collibraSystemName": "mssql-system-name"
  },
  "found_dbname=databasename2;found_hostname=server-name.onmicrosoft.com;found_schema=schema2": {
    "dbname": "oracle-database-name",
    "schema": "oracle-schema-name",
    "dialect": "oracle",
    "collibraSystemName": "oracle-system-name"
  },
  "filters":[ 
    {
      "domainId": "default",
      "description": "Filter by display name",
      "workspaceFilter": {
        "includedNames": ["Test1", "Test2"]
      },
      "dashboardFilter": {
        "excludedNames": "*restricted*"
      },
      "reportFilter": {
        "excludedNames": ["report1", "report2", "report3"],
        "includedInApp": true
      }
    }
  ]
}

Filter validation

Filter configurations are validated against the following scenarios:

Duplicate keywords.
Unknown or unsupported keywords.
Contradicting inclusion and exclusion filters.
Mixed filter v1 and filter v2 keywords.
A single workspace is mapped to more than one domain. (In this case, only the first filter is considered.)

If validation fails for any of these scenarios, a warning with failure details is shown in an analyze error on the Technical lineage Sources tab page. Critical errors occur only if the <source ID> configuration file is incorrectly formatted or doesn’t contain valid keywords. In such cases, the filter configuration is not processed. If the configured inclusion and exclusion filters contradict each other, only the exclusion filter is taken into consideration.
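Some of these scenarios can be caught locally before you paste the configuration into the Source configuration field. The following Python sketch checks a list of filter objects for unknown keywords, mixed v1 and v2 keywords, and a missing domainId. The keyword sets are copied from the property tables in this topic; the function name and the warning wording are hypothetical, and this is not the validator that Collibra Data Lineage runs.

```python
# Keywords as listed in the property tables of this topic.
COMMON_KEYS = {"domainId", "description"}
V1_KEYS = {"workspaceNames", "workspaceIds", "capacityNames", "capacityIds",
           "excludeWorkspaceNames", "excludeWorkspaceIds"}
V2_KEYS = {"capacityFilter", "workspaceFilter", "dashboardFilter", "reportFilter"}


def precheck_filters(filters):
    """Return a list of warning strings for a list of filter objects."""
    warnings = []
    uses_v1 = uses_v2 = False
    for index, flt in enumerate(filters):
        keys = set(flt)
        unknown = keys - COMMON_KEYS - V1_KEYS - V2_KEYS
        if unknown:
            warnings.append(f"filter {index}: unknown keywords {sorted(unknown)}")
        if "domainId" not in keys:
            warnings.append(f"filter {index}: domainId is missing")
        uses_v1 = uses_v1 or bool(keys & V1_KEYS)
        uses_v2 = uses_v2 or bool(keys & V2_KEYS)
    if uses_v1 and uses_v2:
        warnings.append("configuration mixes v1 and v2 keywords")
    return warnings
```

An empty list means that none of the three checks fired. It does not guarantee that the configuration passes the product's own validation.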

Source configuration properties

The following table describes the various properties you can use in your JSON code block.

Property

Description

Mandatory?

found_dbname=<database name>;found_hostname=<server name>;found_schema=<schema name>

The database information of supported data sources in Power BI that is typically collected by the lineage harvester. Specify the name of the database (found_dbname), on which server a database is running (found_hostname), and optionally, the name of the schema (found_schema). You then use the child properties to map the names collected by the lineage harvester to the true names.

Important The keys that you specify must be unique. The following configuration would result in an error, because the key found_dbname=databasename1;found_hostname=*;found_schema=schema1 is specified twice.

{
  "found_dbname=databasename1;found_hostname=*;found_schema=schema1": {
    "dbname": "mssql-database-name",
    "schema": "mssql-schema-name",
    "dialect": "mssql",
    "collibraSystemName": "mssql-system-name"
  },
  "found_dbname=databasename1;found_hostname=*;found_schema=schema1": {
    "dbname": "oracle-database-name",
    "schema": "oracle-schema-name",
    "dialect": "oracle",
    "collibraSystemName": "oracle-system-name"
  }
}
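Duplicate keys like these are easy to miss, because many JSON parsers accept them silently. Python's json module, for example, keeps only the last value for a repeated key. The following sketch, which assumes you check the configuration locally with Python before saving it, rejects any object that repeats a key. The function name is hypothetical.

```python
import json


def load_rejecting_duplicates(text):
    """Parse JSON text, raising ValueError if any object repeats a key."""
    def check_pairs(pairs):
        # pairs holds the (key, value) tuples of one JSON object, in order.
        seen = set()
        for key, _ in pairs:
            if key in seen:
                raise ValueError(f"duplicate key: {key}")
            seen.add(key)
        return dict(pairs)

    return json.loads(text, object_pairs_hook=check_pairs)
```

Because the hook runs for every object in the document, the check also covers nested sections such as the entries in filters.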

During metadata analysis, if Collibra Data Lineage cannot match a name that you provide in this mapping – let's say, for example, you mistype the name of the database – an analyze error is produced.

You can use wildcards to capture multiple connection string combinations:

Pattern	Description
*	Matches everything.
?	Matches any single character.
[seq]	Matches any character in "seq".
[!seq]	Matches any character not in "seq".
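The four patterns in this table are the same ones that Unix shell-style matching uses. As a rough illustration, the following sketch uses Python's fnmatch module to show how a key pattern from the example configuration behaves. It assumes that matching is case-sensitive and that the product's matcher behaves like fnmatch, neither of which this topic confirms. The collected values are hypothetical.

```python
from fnmatch import fnmatchcase

# A key pattern from the example configuration in this topic.
PATTERN = "found_dbname=databasename1;found_hostname=*;found_schema=schema1"


def matches(collected, pattern=PATTERN):
    """Case-sensitive shell-style match of a collected key against a pattern."""
    return fnmatchcase(collected, pattern)


# Hypothetical collected values, for illustration only.
print(matches("found_dbname=databasename1;found_hostname=srv01;found_schema=schema1"))
print(matches("found_dbname=databasename2;found_hostname=srv01;found_schema=schema1"))
print(fnmatchcase("schema1", "schema?"))      # ? matches exactly one character
print(fnmatchcase("schema1", "schema[12]"))   # [seq] matches one character in seq
print(fnmatchcase("schema1", "schema[!12]"))  # [!seq] matches one character not in seq
```

The first call matches because the * in found_hostname accepts any server name. The second does not, because the database name differs.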

dbname

The true name (display name) of the database collected by the lineage harvester.

schema

The true name (display name) of the schema collected by the lineage harvester.

If the lineage harvester fails to find a specific schema, it uses the schema you specify in this property.

Important Schema mapping is available for schemas that come from Power Query connections. It is not available, however, if a Power Query connection is created with SQL (or MDX) statements and the schema is specified in those statements.

dialect

The dialect of the supported data source in Power BI.

collibraSystemName

The system or server name of a database.

Warning The value of this property must exactly match (including for case-sensitivity) the name of your System asset in Collibra.

If you set the useCollibraSystemName property to true in your lineage harvester configuration file, but you either don't create a <source ID> configuration file, or don't specify a value for the collibraSystemName property in your <source ID> configuration file, the system name in the technical lineage is "DEFAULT".

How to configure this property if you have two databases with the same name

Let's assume you have two databases named Customers. When you prepare the physical data layer in Data Catalog, you create a System asset for each of these databases. Let's say you named them Customers-Europe and Customers-USA. You can then configure this property as follows.

"found_dbname=databasename1;found_hostname=*;found_schema=schema1": {
  "dbname": "Customers",
  "schema": "mssql-schema-name",
  "dialect": "mssql",
  "collibraSystemName": "Customers-Europe"
},
"found_dbname=databasename2;found_hostname=server-name.onmicrosoft.com;found_schema=schema2": {
  "dbname": "Customers",
  "schema": "oracle-schema-name",
  "dialect": "oracle",
  "collibraSystemName": "Customers-USA"
},

Yes

filters

This section allows you to specify the Power BI workspaces from which you want to ingest metadata.

If you specify a capacity, all of the workspaces in that capacity are also ingested.

Workspace filtering takes precedence over capacity filtering, meaning workspaces are filtered first. If there is no explicit exclusion of capacities containing workspaces, all capacities containing workspaces are ingested. Filtering of reports and dashboards is subordinate to workspace filtering, meaning that to include reports and dashboards from a certain workspace, that workspace has to be ingested as well. Reports and dashboards from a single workspace cannot be ingested in different domains. Any configured dashboard and report filtering is then taken into consideration.

Any meta-characters in the name of a workspace must be enclosed in square brackets "[ ]". For example, a workspace with the name Sale and Marketing [automobiles] must be formatted as follows:
Sale and Marketing [[]automobiles[]]
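The bracket escaping shown above can be produced mechanically. The following Python sketch wraps each meta-character in square brackets and then uses fnmatch to confirm that the escaped form matches the literal workspace name. The helper name and the set of escaped characters (*, ?, [ and ]) are assumptions, as is the idea that the product's matcher behaves like fnmatch.

```python
from fnmatch import fnmatchcase

# Characters treated as wildcard meta-characters in this sketch (assumption).
META_CHARACTERS = "*?[]"


def escape_name(name):
    """Wrap each wildcard meta-character in square brackets."""
    return "".join(f"[{ch}]" if ch in META_CHARACTERS else ch for ch in name)


name = "Sale and Marketing [automobiles]"
escaped = escape_name(name)
print(escaped)
print(fnmatchcase(name, escaped))  # escaped pattern matches the literal name
print(fnmatchcase(name, name))     # unescaped brackets act as a character set
```

Without escaping, [automobiles] is read as "one character from this set", so the unescaped pattern does not match the workspace's actual name.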

Important If you don't want to specify the Power BI workspaces from which to ingest, you must completely remove this filters section.

You can use wildcards to match multiple names:

Pattern	Description
*	Matches everything.
?	Matches any single character.
[seq]	Matches any character in "seq".
[!seq]	Matches any character not in "seq".

domainId

The unique resource ID of the domain (or domains), in Collibra Platform, in which you want to ingest the Power BI assets.

You can find the domain ID by clicking the domain type. Then look in the URL of your browser to find the ID. The URL looks like https://<yourcollibrainstance>/domain/<domain ID>?<view>.

Yes
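If you collect domain IDs for several filters, you can pull the ID out of a copied URL instead of reading it by eye. The following Python sketch assumes the URL shape described for the domainId property, with the ID as the path segment that follows /domain/. The function name and the host in the usage line are hypothetical.

```python
from urllib.parse import urlparse


def domain_id_from_url(url):
    """Return the path segment after /domain/, or None if the URL has none."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if "domain" in segments:
        position = segments.index("domain")
        if position + 1 < len(segments):
            return segments[position + 1]
    return None


# Hypothetical host; the ID is one used in the examples in this topic.
print(domain_id_from_url(
    "https://example.invalid/domain/d0f2966c-018b-4e8a-9085-266b3c01c46f?view=x"))
```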

description

Any description, as you see fit.

capacityFilter

This section allows you to specify the capacities from which you want to ingest metadata. You can include certain capacities and exclude others.

includedNames

The names of the capacities from which you want to ingest metadata.

includedIds

The IDs of the capacities from which you want to ingest metadata.

excludedNames

The names of the capacities that you want to exclude from metadata ingestion.

excludedIds

The IDs of the capacities that you want to exclude from metadata ingestion.

workspaceFilter

This section allows you to specify the workspaces from which you want to ingest metadata. You can include certain workspaces and exclude others.

We highly recommend that you read through Filtering Power BI workspaces for important information and guidance before configuring your filters.

includedNames

The names of the workspaces from which you want to ingest metadata.

includedIds

The IDs of the workspaces from which you want to ingest metadata.

excludedNames

The names of the workspaces that you want to exclude from metadata ingestion.

This is useful if you want to exclude, for example, dedicated development and testing workspaces.

The metadata of inactive and personal workspaces is not harvested or uploaded to the Collibra Data Lineage service instance. An inactive workspace is one for which no reports or dashboards have been viewed in the past 60 days. My workspace is the personal workspace for any Power BI customer to work with their own, personal content.

excludedIds

The IDs of the workspaces that you want to exclude from metadata ingestion.

dashboardFilter

This section allows you to specify the dashboards from which you want to ingest metadata. You can include certain dashboards and exclude others.

includedNames

The names of the dashboards from which you want to ingest metadata.

includedIds

The IDs of the dashboards from which you want to ingest metadata.

excludedNames

The names of the dashboards that you want to exclude from metadata ingestion.

excludedIds

The IDs of the dashboards that you want to exclude from metadata ingestion.

reportFilter

This section allows you to specify the reports from which you want to ingest metadata. You can include certain reports and exclude others.

includedNames

The names of the reports from which you want to ingest metadata.

includedIds

The IDs of the reports from which you want to ingest metadata.

excludedNames

The names of the reports that you want to exclude from metadata ingestion.

excludedIds

The IDs of the reports that you want to exclude from metadata ingestion.

createAppReports

Use this keyword to specify that you don't want to ingest the in-app versions of reports.

If "createAppReports": false, in-app versions of reports are not ingested. Only the original reports are ingested.

If "createAppReports": true, in-app versions of reports are ingested, along with the original reports.

If you don't use the createAppReports keyword, in-app versions of reports are ingested, along with the original reports.

includedInApp

Use this keyword to specify how you want Collibra Data Lineage to address reports that are included in published Power BI apps.

If "includedInApp": true:

Original reports that are included in an app are ingested.
In-app versions of reports are ingested, unless "createAppReports": false.
Reports that are not included in an app are not ingested.

If "includedInApp": false, only reports that are not included in an app are ingested.

If you don't use the includedInApp keyword, all reports are ingested, including:

Reports that are included in apps, along with their in-app versions.
Reports that are not included in an app.

How this property works in conjunction with the "createAppReports" property

Let's say that you have 8 reports in Power BI:

5 reports are not included in an app.
3 reports are included in an app, which means there are also 3 in-app versions of these reports.

The following table shows which of these reports are ingested, based on how you use the 2 keywords.

	"createAppReports": true (or not used)	"createAppReports": false
includedInApp is not used	11 reports are ingested: the 5 reports not included in an app, the 3 original reports included in an app, and the 3 in-app versions.	8 reports are ingested: the 5 reports not included in an app and the 3 original reports included in an app. The 3 in-app versions of these reports are not created or ingested.
"includedInApp": true	6 reports are ingested: the 3 original reports included in an app and the 3 in-app versions.	3 reports are ingested: the 3 original reports included in an app. The 3 in-app versions of these reports are not created or ingested.
"includedInApp": false	5 reports are ingested: the 5 reports not included in an app.	5 reports are ingested: the 5 reports not included in an app.
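The counts for the 8 example reports can be reproduced with a few lines of Python. This is a sketch of the rules as described for the includedInApp and createAppReports keywords, not the product's implementation. The function name, the report names and the tuple layout are hypothetical, and None stands for "keyword not used".

```python
def ingested_reports(reports, included_in_app=None, create_app_reports=None):
    """Return the names of ingested reports.

    reports is a list of (name, in_app) tuples, where in_app is True if the
    report is included in a published app. None means the keyword is not used.
    """
    result = []
    for name, in_app in reports:
        if included_in_app is True and not in_app:
            continue  # only reports included in an app are ingested
        if included_in_app is False and in_app:
            continue  # only reports not included in an app are ingested
        result.append(name)
        # In-app versions are ingested unless createAppReports is false.
        if in_app and create_app_reports is not False:
            result.append(f"{name} (in-app version)")
    return result


# 5 reports not in an app, 3 reports in an app.
reports = [(f"standalone{i}", False) for i in range(5)]
reports += [(f"app{i}", True) for i in range(3)]
for included_in_app in (None, True, False):
    for create_app_reports in (None, False):
        count = len(ingested_reports(reports, included_in_app, create_app_reports))
        print(included_in_app, create_app_reports, count)
```

The loop prints one count for each combination of the two keywords: 11 and 8 when includedInApp is not used, 6 and 3 when it is true, and 5 in both cases when it is false.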

Considerations

Workspace filtering takes precedence over capacity filtering, meaning workspaces are filtered first. Report filtering and dashboard filtering are subordinate to both capacity filtering and workspace filtering.

Capacities that are empty after workspace filtering and do not pass a filter are excluded. In the following example, workspace "workspace_1" is in capacity "CAPACITY_A". The metadata also includes a capacity "CAPACITY_B", but it is not mentioned for inclusion in the filtering, so it is empty. Only "CAPACITY_A" is included.

{
  "filters": [
    {
      "description": "description",
      "domainId": "d0f2966c-018b-4e8a-9085-266b3c01c46f",
      "capacityFilter": {
        "includedNames": ["CAPACITY_A"]
      },
      "workspaceFilter": {
        "includedNames": ["workspace_1"]
      }
    }
  ]
}

However, if there is no capacity filtering, all capacities are included, even if one or more capacities contain no workspaces due to filtering. This is because all capacities are treated as "included" in filters, unless otherwise specified. In the following example, workspace "workspace_1" is in capacity "CAPACITY_A". The metadata also includes a capacity "CAPACITY_B". Because there is no capacity filtering, both capacities are included.

{
  "filters": [
    {
      "description": "description",
      "domainId": "d0f2966c-018b-4e8a-9085-266b3c01c46f",
      "workspaceFilter": {
        "includedNames": ["workspace_1"]
      }
    }
  ]
}

Inclusion and exclusion properties used with the includedInApp property for reports are applied using the AND logical operator. In the following example, "report1", which is published in an app, is included. "report2", which is not published in an app, is not included.

"reportFilter": {
  "includedNames": ["report1", "report2"],
  "includedInApp": true
}

Examples

In the following example:

Only reports with names that match ABC report* and are in workspace ABC1 are included.
Reports that are not in workspace ABC1 are not included.
Reports that are in capacity ABC Capacity are not included.

{
  "domainId": "12g6d0dc-8291-476a-9bb0-9b13g6cc1356",
  "description": "Filter by display name",
  "capacityFilter": {
    "excludedNames": ["ABC Capacity"]
  },
  "workspaceFilter": {
     "includedNames": ["ABC1"]
  },
  "reportFilter": {
    "includedNames": ["ABC report*"]
  }
}

In the following example, reports with names that match ABC report*, in any workspace, are included.

{
  "domainId": "12g6d0dc-8291-476a-9bb0-9b13g6cc1356",
  "description": "Filter by display name",
  "reportFilter": {
    "includedNames": ["ABC report*"]
  }
}

For report filtering, inclusion and exclusion filters used in combination with the includedInApp property are applied using the AND logical operator. In the following example:

The report named report1, which is included in an app, is included.
The report named report2, which is not in an app, is not included.

"reportFilter": {
  "includedNames": ["report1", "report2"],
  "includedInApp": true
}

In the following example, all reports with names that match report1* are included, with the exception of report report1_backup.

{
  "filters": [
    {
      "domainId": "12g6d0dc-8291-476a-9bb0-9b13g6cc1356",
      "description": "Some description",
      "reportFilter": {
        "includedNames": "report1*",
        "excludedNames": "*_backup"
      }
    }
  ]
}

In the following example, all reports named report1 in workspace workspace_name_1 (only) are included.

{
  "filters": [
    {
      "domainId": "12g6d0dc-8291-476a-9bb0-9b13g6cc1356",
      "description": "description",
      "workspaceFilter": {
        "includedNames": "workspace_name_1"
      },
      "reportFilter": {
        "includedNames": "report1"
      }
    }
  ]
}

In the following example, all workspaces in capacity capacity1 and workspace workspace_name_1 are included.

{
  "filters": [
    {
      "capacityFilter": {
        "includedNames": "capacity1"
      },
      "workspaceFilter": {
        "includedNames": "workspace_name_1"
      },
      "description": "workspace and capacity filter",
      "domainId": "12g6d0dc-8291-476a-9bb0-9b13g6cc1356"
    }
  ]
}

Filter configuration validation

There are several validation rules to ensure that filter configurations are valid and non-contradictory. Failure to pass validation does not affect the integration; rather, warnings are generated and included in analyze errors and the logs.

In the following example, the same workspace is specified for inclusion and exclusion. In this case, the exclusion filter takes precedence, meaning workspace ABC2 is not included.

"workspaceFilter": {
  "includedNames": ["ABC2"],
  "excludedNames": ["ABC2"]
}

The following example is the same scenario as the previous one, except that wildcards are used. The result is the same, meaning workspace ABC2 is not included.

"workspaceFilter": {
  "includedNames": ["ABC*"],
  "excludedNames": ["ABC2"]
}
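The precedence rule in these two examples can be written as a small Python function. This is a sketch of the described behavior, not the product's code. It assumes fnmatch-style wildcard matching, and it assumes that an empty inclusion list means that everything is included, which is inferred from the statement that capacities are treated as included unless otherwise specified. The function name is hypothetical.

```python
from fnmatch import fnmatchcase


def passes_filter(name, included_names=(), excluded_names=()):
    """Apply name filters; a matching exclusion always wins."""
    if any(fnmatchcase(name, pattern) for pattern in excluded_names):
        return False
    if not included_names:
        return True  # assumption: no inclusion list means include everything
    return any(fnmatchcase(name, pattern) for pattern in included_names)


print(passes_filter("ABC2", ["ABC*"], ["ABC2"]))  # excluded despite matching ABC*
print(passes_filter("ABC1", ["ABC*"], ["ABC2"]))  # included
```

Checking exclusions first means that the order of the two lists in the configuration does not matter.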

In the following example, a warning is included in an analysis error because workspace ABC2 is specified in multiple filters.

"workspaceFilter": {
  "includedNames": ["ABC2"]
},
"workspaceFilter": {
  "includedNames": ["ABC2", "ABC3"]
}
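A workspace that appears in more than one filter can be spotted before you save the configuration. The following Python sketch lists each workspace name that is named for inclusion in two or more filter objects. It compares literal names only and does not expand wildcards. The function names are hypothetical, and the handling of a single string in place of a list follows the examples in this topic, which use both forms.

```python
from collections import defaultdict


def as_list(value):
    """Accept either a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value or [])


def workspaces_in_multiple_filters(filters):
    """Map each workspace name to the indexes of the filters that include it."""
    seen = defaultdict(list)
    for index, flt in enumerate(filters):
        names = as_list(flt.get("workspaceFilter", {}).get("includedNames"))
        for name in set(names):
            seen[name].append(index)
    # Keep only names that occur in more than one filter.
    return {name: indexes for name, indexes in seen.items() if len(indexes) > 1}
```

For two filters whose workspaceFilter sections include ["ABC2"] and ["ABC2", "ABC3"], the function returns ABC2 with the indexes 0 and 1.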

The includedInApp property is valid only for reports, meaning in a reportFilter section. In the following example, an analysis error is generated because it is used in the dashboardFilter section.

"dashboardFilter": {
  "includedInApp": true
}

In the following example, a report Test report would qualify for inclusion because it passes both report includedInApp and report includedNames properties. However, due to the order of filtering, the report was already excluded before the inclusion properties were considered. Therefore, the report Test report is not included.

In this case, the report Test report (and any other report that matches the inclusion criteria but is not included) is considered to have an "absent" parent. Configurations that result in dashboards or reports with absent parents result in analysis errors, and the metadata of such dashboards and reports is not ingested.

{
  "filters": [
    {
      "domainId": "12g6d0dc-8291-476a-9bb0-9b13g6cc1356",
      "description": "Filter by display name",
      "capacityFilter": {
        "excludedNames": "Excluded Capacity"
      },
      "workspaceFilter": {
        "excludedNames": "Test1"
      },
      "reportFilter": {
        "includedInApp": true
      }
    },
    {
      "domainId": "default",
      "description": "Filter by display name",
      "reportFilter": {
        "includedNames": "Test report*"
      }
    }
  ]
}

Continuing with this example, if you want to include report Test report in one domain, but not another, consider the following configuration:

{
  "filters": [
    {
      "domainId": "12g6d0dc-8291-476a-9bb0-9b13g6cc1356",
      "description": "Filter by display name",
      "workspaceFilter": {
        "excludedNames": ["Test1"]
      }
    },
    {
      "domainId": "d0f2966c-018b-4e8a-9085-266b3c01c46f",
      "description": "Filter by display name",
      "workspaceFilter": {
        "includedNames": ["Test1"]
      },
      "reportFilter": {
        "includedNames": [
          "Test report"
        ]
      }
    }
  ]
}

In this case, in the domain with ID ending "1356", neither the capacity nor the workspace that includes the report Test report is included. Therefore, you can include the already excluded workspace in the second filter, for the domain with ID ending "c46f".

Warnings in generated analysis errors about "absent" parents can help explain filtering behavior. In the following example, workspace filtering happens first, so the report Report in app is ingested in the domain with ID ending "c46f", thereby rendering obsolete the first filter, a report filter that targets the default domain.

{
  "filters": [
    {
      "domainId": "default",
      "description": "Filter by display name",
      "reportFilter": {
        "includedInApp": true
      }
    },
    {
      "domainId": "d0f2966c-018b-4e8a-9085-266b3c01c46f",
      "description": "Filter by display name",
      "workspaceFilter": {
        "includedNames": ["workspace_name_1"]
      },
      "reportFilter": {
        "includedNames": ["Report in app"]
      }
    }
  ]
}

If you want to ingest reports into multiple domains, the following example shows the recommended configuration.

{
  "filters": [
    {
      "domainId": "d0f2966c-018b-4e8a-9085-266b3c01c46f",
      "description": "Filter by display name",
      "workspaceFilter": {
        "includedNames": ["workspace_name_1"]
      },
      "reportFilter": {
        "includedInApp": true
      }
    },
    {
      "domainId": "default",
      "description": "Filter by display name",
      "workspaceFilter": {
        "includedNames": ["workspace_name_2"]
      },
      "reportFilter": {
        "includedInApp": true
      }
    }
  ]
}

Source configuration with filtering v1

The value of the Source configuration field must be a valid block of JSON code, for example:

{
    "found_dbname=databasename1;found_hostname=*;found_schema=schema1": {
        "dbname": "mssql-database-name",
        "schema": "mssql-schema-name",
        "dialect": "mssql",
        "collibraSystemName": "mssql-system-name"
    },
    "found_dbname=databasename2;found_hostname=server-name.onmicrosoft.com;found_schema=schema2": {
        "dbname": "oracle-database-name",
        "schema": "oracle-schema-name",
        "dialect": "oracle",
        "collibraSystemName": "oracle-system-name"
    },
    "filters":[ 
        {
        "domainId": "<domain-ref-id>",
        "description": "FirstFilter",
        "workspaceNames": ["*"],
        "excludeWorkspaceIds": ["workspaceC", "workspaceD"]
        },
        {
        "domainId": "<domain-ref-id>",
        "description": "SecondFilter",
        "workspaceNames": ["workspace3", "workspace4"],
        "capacityIds": ["id1","id2"]
        }
    ]
}

The following table describes the various properties you can use in your JSON code block.

Property

Description

Mandatory?

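found_dbname=<database name>;found_hostname=<server name>;found_schema=<schema name>

The database information of supported data sources in Power BI that is typically collected by the lineage harvester. Specify the name of the database (found_dbname), on which server a database is running (found_hostname), and optionally, the name of the schema (found_schema). You then use the child properties to map the names collected by the lineage harvester to the true names.

Important The keys that you specify must be unique. The following configuration would result in an error, because the key found_dbname=databasename1;found_hostname=*;found_schema=schema1 is specified twice.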

{
	"found_dbname=databasename1;found_hostname=*;found_schema=schema1": {
		"dbname": "mssql-database-name",
		"schema": "mssql-schema-name",
		"dialect": "mssql",
		"collibraSystemName": "mssql-system-name"
	},
	"found_dbname=databasename1;found_hostname=*;found_schema=schema1": {
		"dbname": "oracle-database-name",
		"schema": "oracle-schema-name",
		"dialect": "oracle",
		"collibraSystemName": "oracle-system-name"
	}
}

You can use wildcards to capture multiple connection string combinations:

Pattern	Description
*	Matches everything.
?	Matches any single character.
[seq]	Matches any character in "seq".
[!seq]	Matches any character not in "seq".

dbname

The true name (display name) of the database collected by the lineage harvester.

schema

The true name (display name) of the schema collected by the lineage harvester.

If the lineage harvester fails to find a specific schema, it uses the schema you specify in this property.

dialect

The dialect of the supported data source in Power BI.

collibraSystemName

The system or server name of a database.

Warning The value of this property must exactly match (including for case-sensitivity) the name of your System asset in Collibra.

How to configure this property if you have two databases with the same name

"found_dbname=databasename1;found_hostname=*;found_schema=schema1": {
	"dbname": "Customers",
	"schema": "mssql-schema-name",
	"dialect": "mssql",
	"collibraSystemName": "Customers-Europe"
},
"found_dbname=databasename2;found_hostname=server-name.onmicrosoft.com;found_schema=schema2": {
	"dbname": "Customers",
	"schema": "oracle-schema-name",
	"dialect": "oracle",
	"collibraSystemName": "Customers-USA"
},

Yes

filters

This section allows you to specify the Power BI workspaces from which you want to ingest metadata.

If you specify a capacity, all of the workspaces in that capacity are also ingested.

Important If you don't want to specify the Power BI workspaces from which to ingest, you must completely remove this filters section.

You can use wildcards to match multiple names:

Pattern	Description
*	Matches everything.
?	Matches any single character.
[seq]	Matches any character in "seq".
[!seq]	Matches any character not in "seq".

domainId

The unique resource ID of the domain (or domains), in Collibra Platform, in which you want to ingest the Power BI assets.

You can find the domain ID by clicking the domain type. Then look in the URL of your browser to find the ID. The URL looks like https://<yourcollibrainstance>/domain/<domain ID>?<view>.

Yes

description

Any description, as you see fit.

workspaceNames

The names of Power BI workspaces from which you want to ingest metadata.

Any meta-characters in the name of a workspace must be enclosed in square brackets "[ ]". For example, a workspace with the name "Sale and Marketing [automobiles]" should be formatted as follows:
Sale and Marketing [[]automobiles[]]

workspaceIds

The IDs of Power BI workspaces from which you want to ingest metadata.

We highly recommend that you read through Filtering Power BI workspaces for important information and guidance before configuring your filters.

capacityNames

The names of capacities on which you want to filter.

capacityIds

The IDs of capacities on which you want to filter.

Important All letters in a capacity ID must be in upper case.

excludeWorkspaceNames

The names of Power BI workspaces that you want to exclude from the ingestion job.

This is useful if you want to exclude, for example, dedicated development and testing workspaces.

For complete details on the advantages, limitations and configuration considerations of this property, see Filtering Power BI workspaces.

excludeWorkspaceIds

The IDs of Power BI workspaces that you want to exclude from the ingestion job.

This is useful if you want to exclude, for example, dedicated development and testing workspaces.

For complete details on the advantages, limitations and configuration considerations of this property, see Filtering Power BI workspaces.

Filter configuration validation

In the following example, the same workspace is specified for inclusion and exclusion. In this case, the exclusion filter takes precedence, meaning workspace ABC2 is not included.

"workspaceNames": ["ABC2"],
"excludeWorkspaceNames": ["ABC2"]

The following example is the same scenario as the previous one, except that wildcards are used. The result is the same, meaning workspace ABC2 is not included.

"workspaceNames": ["ABC*"],
"excludeWorkspaceNames": ["ABC2"]

In the following example, a warning is included in an analysis error because workspace ABC2 is specified in multiple filters.

{
  "domainId": "<domain-ref-id>",
  "description": "FirstFilter",
  "workspaceNames": ["ABC2"]
},
{
  "domainId": "<domain-ref-id>",
  "description": "SecondFilter",
  "workspaceNames": ["ABC2", "ABC3"]
}