Enable and calculate data similarity
To show similar assets, data similarity must be enabled in your environment and similarity scores must be calculated for your data.
Tip: Even if data similarity is enabled in your environment, you can specify that you don't want to calculate similarity scores for a data source.
Before you begin
- Data similarity is a cloud-only feature and is not certified for FedRAMP.
- This feature is in Beta testing.
- You are using Edge.
- Data Marketplace is enabled.
Steps
| # | Step | More details |
|---|---|---|
| 1 | In the Services Configuration settings, enable the Calculate Data Similarity (Beta) profiling setting. Data similarity scores can then be calculated when you profile a data source via Edge. | Depending on your environment, follow this procedure either in the Services Configuration section of the Collibra settings or in Collibra Console. |
| 2 | In the Collibra settings, enable the Data Similarity (Beta) setting for Data Marketplace. If, for a Table asset, some assets have a similarity score higher than 50%, the Similar Data tab is visible in the Data Marketplace asset preview. | Before you begin: The Settings landing page is enabled. Required permissions: You are an administrator in Data Marketplace. |
| 3 | Register a data source via Edge and profile the data. Similarity scores are calculated for the profiled Table assets. | Important: If you don't want to calculate similarity scores for a data source during profiling, you can deactivate the calculation via the profiling capability configuration. In the capability, add the following parameter in the Other section: |
What's next?
If a data consumer in Data Marketplace opens a Table asset preview, and similar assets are available for this table, the Similar Data tab is shown.
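The visibility rule described on this page (the Similar Data tab appears when some assets score higher than 50%) can be pictured as a simple threshold check. The Python sketch below is illustrative only and is not Collibra code or a Collibra API. The function name, the constant name, and the representation of scores as fractions between 0 and 1 are assumptions made for the example.

```python
from typing import Iterable

# Assumption: similarity scores are expressed as fractions in [0, 1].
# The 50% threshold is the value stated in the steps on this page.
SIMILARITY_THRESHOLD = 0.50


def similar_data_tab_visible(
    scores: Iterable[float],
    threshold: float = SIMILARITY_THRESHOLD,
) -> bool:
    """Hypothetical helper: True if at least one score is strictly above the threshold."""
    return any(score > threshold for score in scores)


if __name__ == "__main__":
    # One candidate asset scores above 50%, so the tab would be shown.
    print(similar_data_tab_visible([0.35, 0.72]))
    # A score of exactly 50% is not "higher than 50%", so the tab stays hidden.
    print(similar_data_tab_visible([0.20, 0.50]))
    # A table with no calculated scores has nothing to show.
    print(similar_data_tab_visible([]))
```

The comparison is strict because the page says "higher than 50%". Whether the product treats a score of exactly 50% the same way is not stated here, so that edge case is an assumption of the sketch.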