Power BI source configuration
Note This topic is only relevant if you are creating technical lineage via Edge. If you are using the CLI lineage harvester (deprecated), you need to create a <source ID> configuration file. The CLI harvester will officially reach its End of Life on July 31, 2026.
The Source configuration field in the Power BI technical lineage Edge capability allows you to:
- Map the names of the server, database and schema that were collected by the lineage harvester to their true names. Note Mapping doesn't work for custom SQL.
- Configure filtering. We highly recommend that you read through Filtering Power BI workspaces for important information and guidance before configuring your filters.
- If useCollibraSystemName in the lineage harvester configuration file is set to true, use the collibraSystemName property to specify the system name of databases in Power BI. Collibra Data Lineage uses the system names to match the structure of databases in Power BI to assets in Data Catalog.
Tip Filtering v2 allows you to filter on dashboards and reports — including in-app reports — in addition to capacities and workspaces. You are still free to use your v1 filter configuration. Both filter versions are addressed in this topic.
Source configuration with filtering v2
The value of the Source configuration field must be a valid block of JSON code, for example:
{
"found_dbname=databasename1;found_hostname=*;found_schema=schema1": {
"dbname": "mssql-database-name",
"schema": "mssql-schema-name",
"dialect": "mssql",
"collibraSystemName": "mssql-system-name"
},
"found_dbname=databasename2;found_hostname=server-name.onmicrosoft.com;found_schema=schema2": {
"dbname": "oracle-database-name",
"schema": "oracle-schema-name",
"dialect": "oracle",
"collibraSystemName": "oracle-system-name"
},
"filters":[
{
"domainId": "default",
"description": "Filter by display name",
"workspaceFilter": {
"includedNames": ["Test1", "Test2"]
},
"dashboardFilter": {
"excludedNames": "*restricted*"
},
"reportFilter": {
"excludedNames": ["report1", "report2", "report3"],
"includedInApp": true
}
}
]
}
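Because the field only accepts valid JSON, and because the mapping keys must be unique (as the properties table in this topic states), it can help to check a configuration locally before pasting it in. The following is a minimal sketch using Python's standard json module. The function name find_duplicate_keys is hypothetical, and the check is not part of Collibra Data Lineage.

```python
import json


def find_duplicate_keys(text):
    """Parse JSON text and return every key that occurs twice within one object."""
    duplicates = []

    def record_pairs(pairs):
        # Called once per JSON object with its (key, value) pairs in document order.
        seen = set()
        for key, _ in pairs:
            if key in seen:
                duplicates.append(key)
            seen.add(key)
        return dict(pairs)

    # Raises json.JSONDecodeError (a ValueError) if the text is not valid JSON.
    json.loads(text, object_pairs_hook=record_pairs)
    return duplicates
```

An empty list means the text parsed and no object repeats a key. A plain json.loads call would silently keep only the last of two identical keys, which is why the sketch inspects the raw pairs.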
Filter validation
Filter configurations are validated against the following scenarios:
- Duplicate keywords.
- Unknown or unsupported keywords.
- Contradicting inclusion and exclusion filters.
- Mixed filter v1 and filter v2 keywords.
- A single workspace is mapped to more than one domain. (In this case, only the first filter is considered.)
If validation fails for any of these scenarios, a warning with failure details is shown in an analyze error on the Technical lineage Sources tab page. Critical errors occur only if the <source ID> configuration file is incorrectly formatted or doesn't contain valid keywords. In such cases, the filter configuration is not processed. If the configured inclusion and exclusion filters contradict each other, only the exclusion filter is taken into consideration.
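One of the scenarios above, mixed v1 and v2 keywords, is easy to check before you save the configuration. The following sketch is an illustration only: the keyword sets are copied from the property tables in this topic, the function name is hypothetical, and it treats the whole filters section as one unit, which is an assumption about how the real validation scopes the check.

```python
# Keyword sets taken from the v1 and v2 property tables in this topic.
V1_KEYWORDS = {
    "workspaceNames", "workspaceIds", "capacityNames", "capacityIds",
    "excludeWorkspaceNames", "excludeWorkspaceIds",
}
V2_KEYWORDS = {"capacityFilter", "workspaceFilter", "dashboardFilter", "reportFilter"}


def mixes_filter_versions(config):
    """Return True if the filters section uses both v1 and v2 keywords."""
    used = set()
    for entry in config.get("filters", []):
        used.update(entry)  # adds the keys of each filter entry
    return bool(used & V1_KEYWORDS) and bool(used & V2_KEYWORDS)
```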
Source configuration properties
The following table describes the various properties you can use in your JSON code block.
| Property | Description | Mandatory? |
|---|---|---|
|
found_dbname=<database name>;found_hostname=<server name>;found_schema=<schema name> |
The database information of supported data sources in Power BI that is typically collected by the lineage harvester. Specify the name of the database (
Important The keys that you specify must be unique. The following configuration would result in an error, because the key found_dbname=databasename1;found_hostname=*;found_schema=schema1 is specified twice.
{
"found_dbname=databasename1;found_hostname=*;found_schema=schema1": {
"dbname": "mssql-database-name",
"schema": "mssql-schema-name",
"dialect": "mssql",
"collibraSystemName": "mssql-system-name"
},
"found_dbname=databasename1;found_hostname=*;found_schema=schema1": {
"dbname": "oracle-database-name",
"schema": "oracle-schema-name",
"dialect": "oracle",
"collibraSystemName": "oracle-system-name"
}
}
During metadata analysis, if Collibra Data Lineage cannot match a name that you provide in this mapping (for example, because you mistyped the name of the database), an analyze error is produced.
You can use wildcards to capture multiple connection string combinations, as with found_hostname=* in the keys above. |
No |
| dbname | The true name (display name) of the database collected by the lineage harvester. | No |
| schema | The true name (display name) of the schema collected by the lineage harvester. If the lineage harvester fails to find a specific schema, it uses the schema you specify in this property. Important Schema mapping is available for schemas that come from Power Query connections. It is not available, however, if a Power Query connection is created with SQL (or MDX) statements and the schema is specified in those statements. | No |
| dialect | The dialect of the supported data source in Power BI. | No |
|
collibraSystemName
|
The system or server name of a database. Warning The value of this property must exactly match (including for case-sensitivity) the name of your System asset in Collibra. If you set the
How to configure this property if you have two databases with the same name
Let's assume you have two databases named Customers. When you prepare the physical data layer in Data Catalog, you create a System asset for each of these databases. Let's say you named them Customers-Europe and Customers-USA. You can then configure this property as follows.
"found_dbname=databasename1;found_hostname=*;found_schema=schema1": {
"dbname": "Customers",
"schema": "mssql-schema-name",
"dialect": "mssql",
"collibraSystemName": "Customers-Europe"
},
"found_dbname=databasename2;found_hostname=server-name.onmicrosoft.com;found_schema=schema2": {
"dbname": "Customers",
"schema": "oracle-schema-name",
"dialect": "oracle",
"collibraSystemName": "Customers-USA"
},
|
Yes |
| filters | This section allows you to specify the Power BI workspaces from which you want to ingest metadata. If you specify a capacity, all of the workspaces in that capacity are also ingested. Workspace filtering takes precedence over capacity filtering, meaning workspaces are filtered first. If there is no explicit exclusion of capacities containing workspaces, all capacities containing workspaces are ingested. Filtering of reports and dashboards is subordinate to workspace filtering, meaning that to include reports and dashboards from a certain workspace, that workspace has to be ingested as well. Reports and dashboards from a single workspace cannot be ingested in different domains. Any configured dashboard and report filtering is then taken into consideration. Any meta-characters in the name of a workspace must be enclosed in square brackets "[ ]". For example, a workspace with the name Important If you don't want to specify the Power BI workspaces from which to ingest, you must completely remove this filters section. You can use wildcards to capture multiple connection string combinations: | No |
| domainId | The unique resource ID of the domain (or domains), in Collibra Platform, in which you want to ingest the Power BI assets. You can find the domain ID by clicking the domain type. Then look in the URL of your browser to find the ID. The URL looks like https://<yourcollibrainstance>/domain/<domain ID>?<view>. | Yes |
| description | Any description, as you see fit. | No |
| capacityFilter | This section allows you to specify the capacities from which you want to ingest metadata. You can include certain capacities and exclude others. | |
| includedNames | The names of the capacities from which you want to ingest metadata. | No |
| includedIds | The IDs of the capacities from which you want to ingest metadata. | No |
| excludedNames | The names of the capacities that you want to exclude from metadata ingestion. | No |
| excludedIds | The IDs of the capacities that you want to exclude from metadata ingestion. | No |
| workspaceFilter | This section allows you to specify the workspaces from which you want to ingest metadata. You can include certain workspaces and exclude others. We highly recommend that you read through Filtering Power BI workspaces for important information and guidance before configuring your filters. | No |
| includedNames | The names of the workspaces from which you want to ingest metadata. | No |
| includedIds | The IDs of the workspaces from which you want to ingest metadata. | No |
| excludedNames | The names of the workspaces that you want to exclude from metadata ingestion. This is useful if you want to exclude, for example, dedicated development and testing workspaces. The metadata of inactive and personal workspaces is not harvested or uploaded to the Collibra Data Lineage service instance. An inactive workspace is one for which no reports or dashboards have been viewed in the past 60 days. My workspace is the personal workspace for any Power BI customer to work with their own, personal content. | No |
| excludedIds | The IDs of the workspaces that you want to exclude from metadata ingestion. | No |
| dashboardFilter | This section allows you to specify the dashboards from which you want to ingest metadata. You can include certain dashboards and exclude others. | No |
| includedNames | The names of the dashboards from which you want to ingest metadata. | No |
| includedIds | The IDs of the dashboards from which you want to ingest metadata. | No |
| excludedNames | The names of the dashboards that you want to exclude from metadata ingestion. | No |
| excludedIds | The IDs of the dashboards that you want to exclude from metadata ingestion. | No |
| reportFilter | This section allows you to specify the reports from which you want to ingest metadata. You can include certain reports and exclude others. | No |
| includedNames | The names of the reports from which you want to ingest metadata. | No |
| includedIds | The IDs of the reports from which you want to ingest metadata. | No |
| excludedNames | The names of the reports that you want to exclude from metadata ingestion. | No |
| excludedIds | The IDs of the reports that you want to exclude from metadata ingestion. | No |
| createAppReports | Use this keyword to specify that you don't want to ingest the in-app versions of reports. | |
|
includedInApp
|
Use this keyword to specify how you want Collibra Data Lineage to address reports that are included in published Power BI apps.
How this property works in conjunction with the "createAppReports" property Let's say that you have 8 reports in Power BI:
The following table shows which of these reports are ingested, based on how you use the 2 keywords.
|
No |
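The table describes reading the domain ID from a URL of the form https://<yourcollibrainstance>/domain/<domain ID>?<view>. As a small convenience, the following sketch pulls that path segment out with Python's urllib.parse. The function name is hypothetical, and the sketch assumes only the URL shape quoted in the table.

```python
from urllib.parse import urlparse


def domain_id_from_url(url):
    """Return the path segment that follows 'domain' in the URL, or None if absent."""
    segments = [segment for segment in urlparse(url).path.split("/") if segment]
    if "domain" in segments:
        index = segments.index("domain")
        if index + 1 < len(segments):
            return segments[index + 1]
    return None
```

The query string (the <view> part) is ignored because urlparse separates it from the path.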
Considerations
Workspace filtering takes precedence over capacity filtering, meaning workspaces are filtered first. Report filtering and dashboard filtering are subordinate to both capacity filtering and workspace filtering.
Capacities that are empty after workspace filtering and do not pass a filter are excluded. In the following example, workspace "workspace_1" is in capacity "CAPACITY_A". The metadata also includes a capacity "CAPACITY_B", but it is not mentioned for inclusion in the filtering, so it is empty. Only "CAPACITY_A" is included.
{
"filters": [
{
"description": "description",
"domainId": "d0f2966c-018b-4e8a-9085-266b3c01c46f",
"capacityFilter": {
"includedNames": ["CAPACITY_A"]
},
"workspaceFilter": {
"includedNames": ["workspace_1"]
}
}
]
}
However, if there is no capacity filtering, all capacities are included, even if one or more capacities contain no workspaces due to filtering. This is because all capacities are treated as "included" in filters, unless otherwise specified. In the following example, workspace "workspace_1" is in capacity "CAPACITY_A". The metadata also includes a capacity "CAPACITY_B". Because there is no capacity filtering, both capacities are included.
{
"filters": [
{
"description": "description",
"domainId": "d0f2966c-018b-4e8a-9085-266b3c01c46f",
"workspaceFilter": {
"includedNames": ["workspace_1"]
}
}
]
}
Inclusion and exclusion properties used with the includedInApp property for reports are applied using the AND logical operator. In the following example, "report1", which is published in an app, is included. "report2", which is not published in an app, is not included.
"reportFilter": {
"includedNames": ["report1", "report2"],
"includedInApp": true
}
Examples
In the following example:
- Only reports with names that match ABC report* and are in workspace ABC1 are included.
- Reports that are not in workspace ABC1 are not included.
- Reports that are in capacity ABC Capacity are not included.
{
"domainId": "12g6d0dc-8291-476a-9bb0-9b13g6cc1356",
"description": "Filter by display name",
"capacityFilter": {
"excludedNames": ["ABC Capacity"]
},
"workspaceFilter": {
"includedNames": ["ABC1"]
},
"reportFilter": {
"includedNames": ["ABC report*"]
}
}
In the following example, reports with names that match ABC report*, in any workspace, are included.
{
"domainId": "12g6d0dc-8291-476a-9bb0-9b13g6cc1356",
"description": "Filter by display name",
"reportFilter": {
"includedNames": ["ABC report*"]
}
}
For report filtering, inclusion and exclusion filters used in combination with the includedInApp property are applied using the AND logical operator. In the following example:
- The in-app report named report1 is included.
- Let's say that a report named report2 is not in an app. That report is not included.
"reportFilter": {
"includedNames": ["report1", "report2"],
"includedInApp": true
}
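The AND combination described in this example can be sketched as follows. This is an illustration, not the Collibra implementation: the function name is hypothetical, Python's fnmatchcase stands in for the product's wildcard matching, and because this topic only describes the case where includedInApp is true, any other value imposes no constraint in the sketch.

```python
from fnmatch import fnmatchcase


def report_included(name, in_app, report_filter):
    """Return True if the report passes includedNames AND the includedInApp condition."""
    names = report_filter.get("includedNames")
    if isinstance(names, str):  # the examples in this topic also use a single string
        names = [names]
    name_ok = names is None or any(fnmatchcase(name, pattern) for pattern in names)
    # Assumption: only includedInApp = true restricts the result to in-app reports.
    app_ok = in_app or not report_filter.get("includedInApp", False)
    return name_ok and app_ok
```

With the filter from the example, report1 (in an app) passes both conditions, while report2 (not in an app) matches by name but fails the in-app condition.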
In the following example, all reports with names that match report1* are included, with the exception of report report1_backup.
{
"filters": [
{
"domainId": "12g6d0dc-8291-476a-9bb0-9b13g6cc1356",
"description": "Some description",
"reportFilter": {
"includedNames": "report1*",
"excludedNames": "*_backup"
}
}
]
}
In the following example, all reports named report1 in workspace workspace_name_1 (only) are included.
{
"filters": [
{
"domainId": "12g6d0dc-8291-476a-9bb0-9b13g6cc1356",
"description": "description",
"workspaceFilter": {
"includedNames": "workspace_name_1"
},
"reportFilter": {
"includedNames": "report1"
}
}
]
}
In the following example, all workspaces in capacity capacity1 and workspace workspace_name_1 are included.
{
"filters": [
{
"capacityFilter": {
"includedNames": "capacity1"
},
"workspaceFilter": {
"includedNames": "workspace_name_1"
},
"description": "workspace and capacity filter",
"domainId": "12g6d0dc-8291-476a-9bb0-9b13g6cc1356"
}
]
}
Filter configuration validation
There are several validation rules to ensure that filter configurations are valid and non-contradictory. Failure to pass validation does not affect the integration; rather, warnings are generated and included in analyze errors and the logs.
In the following example, the same workspace is specified for inclusion and exclusion. In this case, the exclusion filter takes precedence, meaning workspace ABC2 is not included.
"workspaceFilter": {
"includedNames": ["ABC2"],
"excludedNames": ["ABC2"]
}
The following example is the same scenario as the previous one, except that wildcards are used. The result is the same, meaning workspace ABC2 is not included.
"workspaceFilter": {
"includedNames": ["ABC*"],
"excludedNames": ["ABC2"]
}
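The precedence of exclusion over inclusion in the two workspaceFilter examples can be sketched as below. The function names are hypothetical, and fnmatchcase is only a stand-in for the product's wildcard handling: it also treats ? and [ ] as special characters, which may differ from how Collibra Data Lineage interprets names.

```python
from fnmatch import fnmatchcase


def as_list(value):
    """Accept a single string, a list, or None, and always return a list."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def workspace_included(name, workspace_filter):
    """Exclusion wins: a name matching excludedNames is dropped even if it is also included."""
    if any(fnmatchcase(name, p) for p in as_list(workspace_filter.get("excludedNames"))):
        return False
    included = as_list(workspace_filter.get("includedNames"))
    # Assumption: without includedNames, every workspace that is not excluded passes.
    return not included or any(fnmatchcase(name, p) for p in included)
```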
In the following example, a warning is included in an analysis error because workspace ABC2 is specified in multiple filters.
"workspaceFilter": {
"includedNames": ["ABC2"]
},
"workspaceFilter": {
"includedNames": ["ABC2", "ABC3"]
}
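To spot this situation before saving, you can scan the filters for workspace names that appear more than once. The sketch below follows the rule stated under Filter validation that only the first filter is considered. The function name is hypothetical, and the sketch compares literal names only, ignoring wildcards and IDs.

```python
def assign_workspaces(filters):
    """Map each listed workspace name to the domainId of the first filter that includes it.

    Returns (assignment, warnings). Only literal names in
    workspaceFilter.includedNames are compared.
    """
    assignment = {}
    warnings = []
    for position, entry in enumerate(filters):
        names = entry.get("workspaceFilter", {}).get("includedNames", [])
        if isinstance(names, str):  # a single string is also used in this topic's examples
            names = [names]
        for name in names:
            if name in assignment:
                warnings.append(
                    f"workspace {name!r} in filter {position} is already covered by an earlier filter"
                )
            else:
                assignment[name] = entry["domainId"]
    return assignment, warnings
```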
The includedInApp property is valid only for reports, meaning in a reportFilter section. In the following example, an analysis error is generated because it is used in the dashboardFilter section.
"dashboardFilter": {
"includedInApp": true
}
In the following example, a report Test report would qualify for inclusion because it passes both report includedInApp and report includedNames properties. However, due to the order of filtering, the report was already excluded before the inclusion properties were considered. Therefore, the report Test report is not included.
In this case, the report Test report (and any other report that matches the inclusion criteria but is not included) is considered to have an "absent" parent. Configurations that result in dashboards or reports with absent parents result in analysis errors, and the metadata of such dashboards and reports is not ingested.
{
"filters": [
{
"domainId": "12g6d0dc-8291-476a-9bb0-9b13g6cc1356",
"description": "Filter by display name",
"capacityFilter": {
"excludedNames": "Excluded Capacity"
},
"workspaceFilter": {
"excludedNames": "Test1"
},
"reportFilter": {
"includedInApp": true
}
},
{
"domainId": "default",
"description": "Filter by display name",
"reportFilter": {
"includedNames": "Test report*"
}
}
]
}
Continuing with this example, if you want to include report Test report in one domain, but not another, consider the following configuration:
{
"filters": [
{
"domainId": "12g6d0dc-8291-476a-9bb0-9b13g6cc1356",
"description": "Filter by display name",
"workspaceFilter": {
"excludedNames": ["Test1"]
}
},
{
"domainId": "d0f2966c-018b-4e8a-9085-266b3c01c46f",
"description": "Filter by display name",
"workspaceFilter": {
"includedNames": ["Test1"]
},
"reportFilter": {
"includedNames": [
"Test report"
]
}
}
]
}
In this case, in the domain with ID ending "1356", neither the capacity nor the workspace that includes the report Test report is included. Therefore, you can include the already excluded workspace in the second filter, for the domain with ID ending "c46f".
Warnings in generated analysis errors about "absent" parents can help explain filtering behavior. In the following example, workspace filtering happens first, so the report Report in app is ingested in the domain with ID ending "c46f", thereby rendering obsolete the first filter, a report filter that targets the default domain.
{
"filters": [
{
"domainId": "default",
"description": "Filter by display name",
"reportFilter": {
"includedInApp": true
}
},
{
"domainId": "d0f2966c-018b-4e8a-9085-266b3c01c46f",
"description": "Filter by display name",
"workspaceFilter": {
"includedNames": ["workspace_name_1"]
},
"reportFilter": {
"includedNames": ["Report in app"]
}
}
]
}
If you want to ingest reports into multiple domains, the following example shows the recommended configuration.
{
"filters": [
{
"domainId": "d0f2966c-018b-4e8a-9085-266b3c01c46f",
"description": "Filter by display name",
"workspaceFilter": {
"includedNames": ["workspace_name_1"]
},
"reportFilter": {
"includedInApp": true
}
},
{
"domainId": "default",
"description": "Filter by display name",
"workspaceFilter": {
"includedNames": ["workspace_name_2"]
},
"reportFilter": {
"includedInApp": true
}
}
]
}
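If you maintain many domains, a configuration in the shape shown above can be generated rather than written by hand. The following is a minimal sketch, assuming a hypothetical mapping from domain ID to workspace names. It only reproduces the structure of the example and rejects a workspace listed under two domains, because reports and dashboards from a single workspace cannot be ingested in different domains.

```python
import json


def build_filters(workspaces_by_domain, in_app_only=True):
    """Build a v2 filters section with one filter entry per domain."""
    seen = {}
    filters = []
    for domain_id, names in workspaces_by_domain.items():
        for name in names:
            if name in seen:
                raise ValueError(f"workspace {name!r} is listed for {seen[name]} and {domain_id}")
            seen[name] = domain_id
        filters.append({
            "domainId": domain_id,
            "description": "Filter by display name",
            "workspaceFilter": {"includedNames": list(names)},
            "reportFilter": {"includedInApp": in_app_only},
        })
    return {"filters": filters}


if __name__ == "__main__":
    config = build_filters({
        "d0f2966c-018b-4e8a-9085-266b3c01c46f": ["workspace_name_1"],
        "default": ["workspace_name_2"],
    })
    print(json.dumps(config, indent=2))
```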
Source configuration with filtering v1
The value of the Source configuration field must be a valid block of JSON code, for example:
{
"found_dbname=databasename1;found_hostname=*;found_schema=schema1": {
"dbname": "mssql-database-name",
"schema": "mssql-schema-name",
"dialect": "mssql",
"collibraSystemName": "mssql-system-name"
},
"found_dbname=databasename2;found_hostname=server-name.onmicrosoft.com;found_schema=schema2": {
"dbname": "oracle-database-name",
"schema": "oracle-schema-name",
"dialect": "oracle",
"collibraSystemName": "oracle-system-name"
},
"filters":[
{
"domainId": "<domain-ref-id>",
"description": "FirstFilter",
"workspaceNames": ["*"],
"excludeWorkspaceIds": ["workspaceC", "workspaceD"]
},
{
"domainId": "<domain-ref-id>",
"description": "SecondFilter",
"workspaceNames": ["workspace3", "workspace4"],
"capacityIds": ["id1","id2"]
}
]
}
The following table describes the various properties you can use in your JSON code block.
| Property | Description | Mandatory? |
|---|---|---|
|
found_dbname=<database name>;found_hostname=<server name>;found_schema=<schema name> |
The database information of supported data sources in Power BI that is typically collected by the lineage harvester. Specify the name of the database (
Important The keys that you specify must be unique. The following configuration would result in an error, because the key found_dbname=databasename1;found_hostname=*;found_schema=schema1 is specified twice.
{
"found_dbname=databasename1;found_hostname=*;found_schema=schema1": {
"dbname": "mssql-database-name",
"schema": "mssql-schema-name",
"dialect": "mssql",
"collibraSystemName": "mssql-system-name"
},
"found_dbname=databasename1;found_hostname=*;found_schema=schema1": {
"dbname": "oracle-database-name",
"schema": "oracle-schema-name",
"dialect": "oracle",
"collibraSystemName": "oracle-system-name"
}
}
During metadata analysis, if Collibra Data Lineage cannot match a name that you provide in this mapping (for example, because you mistyped the name of the database), an analyze error is produced.
You can use wildcards to capture multiple connection string combinations, as with found_hostname=* in the keys above. |
No |
| dbname | The true name (display name) of the database collected by the lineage harvester. | No |
| schema | The true name (display name) of the schema collected by the lineage harvester. If the lineage harvester fails to find a specific schema, it uses the schema you specify in this property. Important Schema mapping is available for schemas that come from Power Query connections. It is not available, however, if a Power Query connection is created with SQL (or MDX) statements and the schema is specified in those statements. | No |
| dialect | The dialect of the supported data source in Power BI. | No |
|
collibraSystemName
|
The system or server name of a database. Warning The value of this property must exactly match (including for case-sensitivity) the name of your System asset in Collibra. If you set the
How to configure this property if you have two databases with the same name
Let's assume you have two databases named Customers. When you prepare the physical data layer in Data Catalog, you create a System asset for each of these databases. Let's say you named them Customers-Europe and Customers-USA. You can then configure this property as follows.
"found_dbname=databasename1;found_hostname=*;found_schema=schema1": {
"dbname": "Customers",
"schema": "mssql-schema-name",
"dialect": "mssql",
"collibraSystemName": "Customers-Europe"
},
"found_dbname=databasename2;found_hostname=server-name.onmicrosoft.com;found_schema=schema2": {
"dbname": "Customers",
"schema": "oracle-schema-name",
"dialect": "oracle",
"collibraSystemName": "Customers-USA"
},
|
Yes |
| filters | This section allows you to specify the Power BI workspaces from which you want to ingest metadata. If you specify a capacity, all of the workspaces in that capacity are also ingested. Workspace filtering takes precedence over capacity filtering, meaning workspaces are filtered first. If there is no explicit exclusion of capacities containing workspaces, all capacities containing workspaces are ingested. Filtering of reports and dashboards is subordinate to workspace filtering, meaning that to include reports and dashboards from a certain workspace, that workspace has to be ingested as well. Reports and dashboards from a single workspace cannot be ingested in different domains. Any configured dashboard and report filtering is then taken into consideration. Any meta-characters in the name of a workspace must be enclosed in square brackets "[ ]". For example, a workspace with the name Important If you don't want to specify the Power BI workspaces from which to ingest, you must completely remove this filters section. You can use wildcards to capture multiple connection string combinations: | No |
| domainId | The unique resource ID of the domain (or domains), in Collibra Platform, in which you want to ingest the Power BI assets. You can find the domain ID by clicking the domain type. Then look in the URL of your browser to find the ID. The URL looks like https://<yourcollibrainstance>/domain/<domain ID>?<view>. | Yes |
| description | Any description, as you see fit. | No |
| workspaceNames | The names of Power BI workspaces from which you want to ingest metadata. Any meta-characters in the name of a workspace must be enclosed in square brackets "[ ]". For example, a workspace with the name "Sale and Marketing [automobiles]" should be formatted as follows: | No |
| workspaceIds | The IDs of Power BI workspaces from which you want to ingest metadata. We highly recommend that you read through Filtering Power BI workspaces for important information and guidance before configuring your filters. | No |
| capacityNames | The names of capacities on which you want to filter. | No |
| capacityIds | The IDs of capacities on which you want to filter. Important All letters in a capacity ID must be in upper case. | No |
| excludeWorkspaceNames | The names of Power BI workspaces that you want to exclude from the ingestion job. This is useful if you want to exclude, for example, dedicated development and testing workspaces. The metadata of inactive and personal workspaces is not harvested or uploaded to the Collibra Data Lineage service instance. An inactive workspace is one for which no reports or dashboards have been viewed in the past 60 days. My workspace is the personal workspace for any Power BI customer to work with their own, personal content. For complete details on the advantages, limitations and configuration considerations of this property, see Filtering Power BI workspaces. | No |
| excludeWorkspaceIds | The IDs of Power BI workspaces that you want to exclude from the ingestion job. This is useful if you want to exclude, for example, dedicated development and testing workspaces. For complete details on the advantages, limitations and configuration considerations of this property, see Filtering Power BI workspaces. | No |
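The capacityIds row requires all letters in a capacity ID to be in upper case. If your IDs come from a source that emits lower-case values, a small normalization step such as the following sketch can help. The function name is hypothetical, and the sketch only touches the v1 capacityIds keyword inside the filters section.

```python
import copy


def normalize_capacity_ids(config):
    """Return a copy of the configuration with every v1 capacityIds value upper-cased."""
    result = copy.deepcopy(config)  # leave the caller's configuration untouched
    for entry in result.get("filters", []):
        if "capacityIds" in entry:
            entry["capacityIds"] = [capacity_id.upper() for capacity_id in entry["capacityIds"]]
    return result
```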
Filter configuration validation
There are several validation rules to ensure that filter configurations are valid and non-contradictory. Failure to pass validation does not affect the integration; rather, warnings are generated and included in analyze errors and the logs.
In the following example, the same workspace is specified for inclusion and exclusion. In this case, the exclusion filter takes precedence, meaning workspace ABC2 is not included.
"workspaceNames": ["ABC2"],
"excludeWorkspaceNames": ["ABC2"]
The following example is the same scenario as the previous one, except that wildcards are used. The result is the same, meaning workspace ABC2 is not included.
"workspaceNames": ["ABC*"],
"excludeWorkspaceNames": ["ABC2"]
In the following example, a warning is included in an analysis error because workspace ABC2 is specified in multiple filters.
{
"domainId": "<domain-ref-id>",
"description": "FirstFilter",
"workspaceNames": ["ABC2"]
},
{
"domainId": "<domain-ref-id>",
"description": "SecondFilter",
"workspaceNames": ["ABC2", "ABC3"]
}
