Synchronize Microsoft Fabric
Synchronizing Microsoft Fabric is the process of integrating metadata from Fabric and making the data available in Collibra Platform.
You can synchronize manually or automate the process by adding a synchronization schedule.
Prerequisites
In your Collibra environment:
- You have given the Edge user the required permissions.
- You have created an Azure connection.
- You have added the Fabric synchronization capability to the Azure connection.
- Make sure you are on the latest UI because the Fabric integration is available only in the latest UI.
- You have a resource role with the Configure external system resource permission, for example, Owner.
- You have a global role with the Catalog global permission, for example, Catalog Author.
- You have a global role with the View Edge connections and capabilities global permission, for example, Edge integration engineer.
Steps
- On the main toolbar, click → Catalog.
The Catalog homepage opens.
- Click Integrations.
The Integrations page opens.
- Click the Integration Configuration tab.
- In the Connection name column, locate the Azure connection that you used when you added the Fabric synchronization capability and click the capability link in the Capabilities column.
The Fabric synchronization capability configuration page opens.
- In the Synchronization Configuration section, click the Edit icon.
- Complete the following fields:
Field Action
Updated: <timestamp> (Optional) Click Updated: <timestamp> next to Synchronization Configuration, where timestamp indicates the last time when the data was loaded from Microsoft Fabric.
The workspace names are loaded to the dropdown list of the Fabric workspace names field. This can take some time.
Default community In Default community, select a Collibra community to ingest the metadata. Subdomains per workspace will be automatically created in this community.
Fabric domain In Fabric domain, select a Collibra domain to ingest the Microsoft Tenant and Fabric Capacity assets.
Custom tenant name Specify a tenant name to replace the tenant name fetched from the Microsoft API, or leave this field blank to use the one fetched from the API.
How to find the tenant name in Fabric
- Sign in to Microsoft Fabric.
- In the top menu, click the avatar to open your user profile.
- In the Profile section, under Tenant Name, copy the tenant name. For example, collibra.onmicrosoft.com.
Custom capacity name Specify a capacity name to replace the capacity name fetched from the Microsoft API, or leave this field blank to use the ones fetched from the API. You can add the names of multiple Fabric capacities.
To add a custom capacity name:
- Click Add Item.
- In Capacity ID, enter the capacity ID of the Microsoft Fabric capacity.
- In Custom name, enter the name of the Microsoft Fabric capacity.
How to find the capacity name and ID in Fabric
Note that you may need admin access to the Fabric portal, capacity, or workspace to view the details of a Fabric capacity.
- Sign in to Microsoft Fabric.
- In the left sidebar, click Workspaces.
- In the Workspaces section, select your workspace.
- In the tab bar of your selected workspace, click Workspace settings.
- In the left sidebar of the Workspace settings section, click Workspace type.
- In the Workspace type section:
  - Under Details, copy the capacity name in bold. For example, colfabriccapacity1.
  - For Capacity ID, copy the value of the capacity ID.
Fabric workspace names Enter the names of specific Microsoft Fabric workspaces, or leave this field blank to ingest metadata from all workspaces available to your service principal.
To specify the workspace, complete the following steps:
- Click Add Workspace.
- In Workspace, enter the name of the workspace you want to ingest metadata from.
- Click Save.
Maximum files per lakehouse Specify the maximum number of files to be ingested per lakehouse.
- To ingest all files, set the value to -1.
- If the value is set to 0, no files are ingested.
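The meaning of the special values -1 and 0 can be illustrated with a minimal Python sketch. The function name apply_file_limit is hypothetical and not part of Collibra. Which files are kept when a positive limit is reached is an assumption (the first ones in the list), as is the treatment of values below -1 as invalid.

```python
from typing import List


def apply_file_limit(files: List[str], max_files: int) -> List[str]:
    """Return the files that would be kept for one lakehouse.

    -1 keeps every file, 0 keeps none, and a positive value keeps at most
    that many files. Keeping the *first* files is an assumption made only
    for this illustration.
    """
    if max_files == -1:
        return list(files)
    if max_files < -1:
        # Assumption: values below -1 are treated as invalid here.
        raise ValueError("max_files must be -1, 0, or a positive integer")
    return files[:max_files]


lakehouse_files = ["a.parquet", "b.parquet", "c.parquet"]
print(apply_file_limit(lakehouse_files, -1))  # all files
print(apply_file_limit(lakehouse_files, 0))   # no files
print(apply_file_limit(lakehouse_files, 2))   # at most two files
```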
JDBC connections If you want to allow sampling, profiling, and classification of assets created via the Fabric integration, add the JDBC connection information.
To do so, complete the following steps:
- Click Add Item.
- In Database full name, enter the SQL Server database name in the following format: microsofttenant>fabriccapacity>fabricworkspace>databasename
Example: Collibra, Inc>colfabriccapacity1>ColEngWorkspace>integrations-test-sql-database-8fef4727-8ea0-42a0-899b-4769219c105d
- In JDBC connection, select the JDBC connection that you created for your SQL Server database.
- Click Save.
Note Make sure to add all JDBC connections for the SQL Server databases that you want to integrate.
Domain include mappings Optionally, in Domain include mappings, specify the workspaces, warehouses, lakehouses, schemas, tables, or other artifacts from Fabric that you want to integrate. Optionally, also specify the Collibra domains where they need to be added. When include mappings are configured, only matched assets are ingested and everything else is skipped. When no include mappings are configured, all workspaces are ingested into auto-created subdomains under the main domain.
Note that when you include a deeper asset, Collibra creates its parent assets as skeleton assets in their default, auto-created subdomains so the asset tree stays intact. For example, including a single table also creates the schema, parent lakehouse or warehouse, and workspace as skeleton assets in their default subdomains, and ingests the table's columns into the target domain.
To limit the scope of metadata ingestion to specific domains in Collibra, add a domain include mapping:
- Click Add Domain include mapping.
- In Path, add the path to the assets in Fabric for which you want to integrate metadata. A path can be as granular as the following hierarchy: workspace > artifact > schema > table.
- Optionally, in Domain, select the Collibra domain in which you want to integrate the metadata.
Example
- Include an entire workspace and all its artifacts: Sales Workspace to the Sales General domain.
- Include a lakehouse and its tables and columns: Sales Workspace > Customer Lakehouse to the Sales Customer Data domain.
- Include a Fabric database in a separate domain: Sales Workspace > Operations DB to the Sales Operations domain.
- Include a specific schema and its tables and columns: Sales Workspace > Customer Lakehouse > dbo to the Customer DBO domain.
- Include a single table and its columns: Sales Workspace > Customer Lakehouse > dbo > Customers to the Customer Reporting domain.
Domain exclude mappings Optionally, in Domain exclude mappings, add one or more mappings to prevent specific Fabric workspaces or artifacts from being ingested.
Note The exclude mapping has priority over the include mapping.
To exclude specific metadata from being ingested into Collibra, add a domain exclude mapping:
- Click Add Domain exclude mappings.
- In the field, add the path to the assets in Fabric that you want to exclude. A path can be as granular as the following hierarchy: workspace > artifact > schema > table.
Example
- Exclude an entire non-production workspace: Dev Sandbox.
- Exclude a single staging lakehouse while still ingesting the rest of the workspace: Sales Workspace > Staging Lakehouse.
- Exclude a specific schema in a warehouse: Analytics Workspace > Finance Warehouse > raw_staging.
- Exclude a single table: Analytics Workspace > Finance Warehouse > public > test_table.
- Include a lakehouse but exclude one of its tables: include Sales Workspace > Customer Lakehouse, exclude Sales Workspace > Customer Lakehouse > dbo > scratch_table. The lakehouse is ingested without the excluded table.
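The interaction between include and exclude mappings can be sketched in Python as follows. This is an illustration of the rules described above, not Collibra's implementation. The function names parse_path, covers, and is_ingested are hypothetical, and treating a mapping as a match for the asset itself and everything beneath it is an assumption drawn from the examples.

```python
from typing import List, Tuple


def parse_path(path: str) -> Tuple[str, ...]:
    """Split 'workspace > artifact > schema > table' into its levels."""
    parts = tuple(p.strip() for p in path.split(">"))
    if not 1 <= len(parts) <= 4 or any(not p for p in parts):
        raise ValueError(f"invalid path: {path!r}")
    return parts


def covers(mapping: Tuple[str, ...], asset: Tuple[str, ...]) -> bool:
    """True if the mapping path is the asset itself or one of its ancestors."""
    return asset[:len(mapping)] == mapping


def is_ingested(asset_path: str, includes: List[str], excludes: List[str]) -> bool:
    asset = parse_path(asset_path)
    # The exclude mapping has priority over the include mapping.
    if any(covers(parse_path(e), asset) for e in excludes):
        return False
    # With no include mappings configured, everything is ingested.
    if not includes:
        return True
    # Otherwise only matched assets are ingested.
    return any(covers(parse_path(i), asset) for i in includes)


includes = ["Sales Workspace > Customer Lakehouse"]
excludes = ["Sales Workspace > Customer Lakehouse > dbo > scratch_table"]
for path in (
    "Sales Workspace > Customer Lakehouse > dbo > Customers",
    "Sales Workspace > Customer Lakehouse > dbo > scratch_table",
    "Sales Workspace > Staging Lakehouse",
):
    print(path, "->", is_ingested(path, includes, excludes))
```

The sketch deliberately ignores skeleton assets: in the product, the parents of an included asset are still created as skeleton assets in their default subdomains, even though they do not match an include mapping themselves.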
- Click Save.
- Click Synchronize.
A notification indicates the synchronization has started.
- On the main toolbar, click → Catalog.
The Catalog homepage opens.
- Click Integrations.
The Integrations page opens.
- Click the Integration Configuration tab.
- In the Connection name column, locate the Azure connection that you used when you added the Fabric synchronization capability and click the capability link in the Capabilities column.
The synchronization page opens.
- In the Synchronization Configuration section, click the Edit icon.
- Complete the following fields:
Field Action
Updated: <timestamp> (Optional) Click Updated: <timestamp> next to Synchronization Configuration, where timestamp indicates the last time when the data was loaded from Microsoft Fabric.
The workspace names are loaded to the dropdown list of the Fabric workspace names field. This can take some time.
Default community In Default community, select a Collibra community to ingest the metadata. Subdomains per workspace will be automatically created in this community.
Fabric domain In Fabric domain, select a Collibra domain to ingest the Microsoft Tenant and Fabric Capacity assets.
Custom tenant name Specify a tenant name to replace the tenant name fetched from the Microsoft API, or leave this field blank to use the one fetched from the API.
How to find the tenant name in Fabric
- Sign in to Microsoft Fabric.
- In the top menu, click the avatar to open your user profile.
- In the Profile section, under Tenant Name, copy the tenant name. For example, collibra.onmicrosoft.com.
Custom capacity name Specify a capacity name to replace the capacity name fetched from the Microsoft API, or leave this field blank to use the ones fetched from the API. You can add the names of multiple Fabric capacities.
To add a custom capacity name:
- Click Add Item.
- In Capacity ID, enter the capacity ID of the Microsoft Fabric capacity.
- In Custom name, enter the name of the Microsoft Fabric capacity.
How to find the capacity name and ID in Fabric
Note that you may need admin access to the Fabric portal, capacity, or workspace to view the details of a Fabric capacity.
- Sign in to Microsoft Fabric.
- In the left sidebar, click Workspaces.
- In the Workspaces section, select your workspace.
- In the tab bar of your selected workspace, click Workspace settings.
- In the left sidebar of the Workspace settings section, click Workspace type.
- In the Workspace type section:
  - Under Details, copy the capacity name in bold. For example, colfabriccapacity1.
  - For Capacity ID, copy the value of the capacity ID.
Fabric workspace names Enter the names of specific Microsoft Fabric workspaces, or leave this field blank to ingest metadata from all workspaces available to your service principal.
To specify the workspace, complete the following steps:
- Click Add Workspace.
- In Workspace, enter the name of the workspace you want to ingest metadata from.
- Click Save.
Maximum files per lakehouse Specify the maximum number of files to be ingested per lakehouse.
- To ingest all files, set the value to -1.
- If the value is set to 0, no files are ingested.
JDBC connections If you want to allow sampling, profiling, and classification of assets created via the Fabric integration, add the JDBC connection information.
To do so, complete the following steps:
- Click Add Item.
- In Database full name, enter the SQL Server database name in the following format: microsofttenant>fabriccapacity>fabricworkspace>databasename
Example: Collibra, Inc>colfabriccapacity1>ColEngWorkspace>integrations-test-sql-database-8fef4727-8ea0-42a0-899b-4769219c105d
- In JDBC connection, select the JDBC connection that you created for your SQL Server database.
- Click Save.
Note Make sure to add all JDBC connections for the SQL Server databases that you want to integrate.
Domain include mappings Optionally, in Domain include mappings, specify the workspaces, warehouses, lakehouses, schemas, tables, or other artifacts from Fabric that you want to integrate. Optionally, also specify the Collibra domains where they need to be added. When include mappings are configured, only matched assets are ingested and everything else is skipped. When no include mappings are configured, all workspaces are ingested into auto-created subdomains under the main domain.
Note that when you include a deeper asset, Collibra creates its parent assets as skeleton assets in their default, auto-created subdomains so the asset tree stays intact. For example, including a single table also creates the schema, parent lakehouse or warehouse, and workspace as skeleton assets in their default subdomains, and ingests the table's columns into the target domain.
To limit the scope of metadata ingestion to specific domains in Collibra, add a domain include mapping:
- Click Add Domain include mapping.
- In Path, add the path to the assets in Fabric for which you want to integrate metadata. A path can be as granular as the following hierarchy: workspace > artifact > schema > table.
- Optionally, in Domain, select the Collibra domain in which you want to integrate the metadata.
Example
- Include an entire workspace and all its artifacts: Sales Workspace to the Sales General domain.
- Include a lakehouse and its tables and columns: Sales Workspace > Customer Lakehouse to the Sales Customer Data domain.
- Include a Fabric database in a separate domain: Sales Workspace > Operations DB to the Sales Operations domain.
- Include a specific schema and its tables and columns: Sales Workspace > Customer Lakehouse > dbo to the Customer DBO domain.
- Include a single table and its columns: Sales Workspace > Customer Lakehouse > dbo > Customers to the Customer Reporting domain.
Domain exclude mappings Optionally, in Domain exclude mappings, add one or more mappings to prevent specific Fabric workspaces or artifacts from being ingested.
Note The exclude mapping has priority over the include mapping.
To exclude specific metadata from being ingested into Collibra, add a domain exclude mapping:
- Click Add Domain exclude mappings.
- In the field, add the path to the assets in Fabric that you want to exclude. A path can be as granular as the following hierarchy: workspace > artifact > schema > table.
Example
- Exclude an entire non-production workspace: Dev Sandbox.
- Exclude a single staging lakehouse while still ingesting the rest of the workspace: Sales Workspace > Staging Lakehouse.
- Exclude a specific schema in a warehouse: Analytics Workspace > Finance Warehouse > raw_staging.
- Exclude a single table: Analytics Workspace > Finance Warehouse > public > test_table.
- Include a lakehouse but exclude one of its tables: include Sales Workspace > Customer Lakehouse, exclude Sales Workspace > Customer Lakehouse > dbo > scratch_table. The lakehouse is ingested without the excluded table.
- Click Save.
- Click the Add synchronization schedule icon.
- Enter the required information and click Save:
Field Description
Repeat The interval when you want to synchronize automatically. The possible values are: Daily, Weekly, Monthly, and Cron expression.
Cron The Quartz Cron expression that determines when the synchronization takes place.
This field is only visible if you select Cron expression in the Repeat field.
Every The day on which you want to synchronize, for example, Sunday.
This field is only visible if you select Weekly in the Repeat field.
Every first The day of the month on which you want to synchronize, for example, Tuesday.
This field is only visible if you select Monthly in the Repeat field.
At The time at which you want to synchronize automatically, for example, 14:00.
- You can only schedule on the hour. For example, you can add a synchronization schedule at 8:00, but not at 8:45.
- This field is only visible if you select Daily, Weekly, or Monthly in the Repeat field.
Time zone The time zone for the schedule.
The synchronization job synchronizes the Microsoft Fabric metadata.
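For the Cron field, a Quartz Cron expression consists of six required fields (seconds, minutes, hours, day of month, month, day of week) and an optional year, where either day of month or day of week must be ?. The following Python sketch builds expressions equivalent to the Daily and Weekly options. The helper names daily_cron and weekly_cron are hypothetical, and the sketch only produces strings. It does not confirm which expressions a given Collibra environment accepts.

```python
DAYS = ("SUN", "MON", "TUE", "WED", "THU", "FRI", "SAT")


def _check_hour(hour: int) -> None:
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")


def daily_cron(hour: int) -> str:
    """Quartz expression for every day at the given hour, on the hour."""
    _check_hour(hour)
    # seconds minutes hours day-of-month month day-of-week
    return f"0 0 {hour} * * ?"


def weekly_cron(day: str, hour: int) -> str:
    """Quartz expression for one weekday at the given hour, on the hour."""
    _check_hour(hour)
    day = day.upper()
    if day not in DAYS:
        raise ValueError(f"day must be one of {DAYS}")
    # Day of month is '?' because the day of week is specified instead.
    return f"0 0 {hour} ? * {day}"


print(daily_cron(8))           # every day at 08:00
print(weekly_cron("sun", 14))  # every Sunday at 14:00
```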
After the synchronization:
- You can view a summary of the results from the Activities list.
- For information on the integrated data, go to Integrated Microsoft Fabric data.