Configure Cloud Data Classification Platform
Before you can use the Cloud Data Classification Platform in Data Catalog, you have to configure it.
Depending on your environment, follow this procedure either on the Services Configuration tab of the Collibra settings or in Collibra Console.
Prerequisites
- You have the ADMIN or SUPER role in Collibra Console.
- You have a global role that has the System administration global permission.
- The Services Configuration tab is available in the Collibra settings.
Steps
- Open the configuration for editing, using the option that matches your environment.
  On the Services Configuration page of the Collibra settings:
  - On the main toolbar, click [icon], and then click Settings.
    The Collibra settings page opens.
  - Click Services Configuration.
  - Click Edit configuration.
  In Collibra Console, open the DGC service settings for editing:
  - Open Collibra Console.
    Collibra Console opens with the Infrastructure page.
  - In the tab pane, expand an environment to show its services.
  - In the tab pane, click the Data Governance Center service of that environment.
  - Click Configuration.
  - Click Edit configuration.
- Go to the Data Classification section.
- Enter the required information:
  - Machine Learning platform URL: The address of the machine learning platform that will classify your data. This setting requires the SUPER role.
  - Requester Name: The unique name that identifies the client when it uses the Machine Learning platform. This setting requires the SUPER role.
  - API key: The API key that authorizes the requester when it connects to the Machine Learning platform. This setting requires the SUPER role.
  - Enable Data Classification:
    - True: Enable Collibra's data classification technology.
    - False (default): Do not use Collibra's data classification technology.
- If needed, configure the automatic classification acceptance and rejection.
  - Whether automatic acceptance and rejection is active:
    - True: The automatic acceptance and rejection of data classification suggestions is active.
    - False (default): Data classification suggestions are not automatically accepted or rejected.
    Tip: Start by manually accepting and rejecting suggested data classes. Only activate the automatic acceptance and rejection feature if you are comfortable with the data classification results.
  - Automatic acceptance threshold: The confidence level, as a percentage, at or above which data classification suggestions are accepted automatically. The default acceptance threshold is 90.
    If you set this value to 75, classification suggestions with a confidence level of 75% or higher are accepted automatically.
    If multiple classification suggestions for a column meet the threshold, the suggestion with the highest confidence level is accepted automatically, but only if no other suggestion has that same confidence level.
    Example: You set the automatic acceptance threshold to 85 and classify a table with 2 columns.
    - For column A, three classification suggestions are possible, with confidence levels of 93%, 92%, and 90%.
    - For column B, two classification suggestions are possible, both with a confidence level of 86%.
    The results of the automatic acceptance are:
    - For column A, the classification suggestion with 93% is accepted automatically.
    - For column B, nothing is accepted automatically; both suggestions remain visible.
  - Automatic rejection threshold: The confidence level, as a percentage, at or below which data classification suggestions are rejected automatically. The default rejection threshold is 10.
    If you set this value to 49, all data classification suggestions with a confidence level of 49% or lower are rejected automatically.
  Note: If the acceptance threshold and the rejection threshold are set to the same value, a data classification suggestion with exactly that confidence level is rejected.
- Click Save all.
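The automatic acceptance and rejection rules described for the two thresholds can be sketched in a few lines of Python. This is an illustration of the documented rules only, not Collibra's implementation: the `Suggestion` class, the `resolve_column` function, the `pending` status, and the data class names are all hypothetical. The sketch also assumes that suggestions which meet the acceptance threshold but are not the single highest one are left untouched, which the rules above do not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suggestion:
    data_class: str    # hypothetical name of the suggested data class
    confidence: float  # confidence level as a percentage, 0 to 100


def resolve_column(suggestions, accept_threshold=90, reject_threshold=10):
    """Return a dict that maps each data class to 'accepted', 'rejected' or 'pending'."""
    # Assumption: anything that is neither accepted nor rejected stays visible.
    outcome = {s.data_class: "pending" for s in suggestions}

    # Rejection is applied first, so a suggestion that sits exactly on a
    # shared acceptance/rejection threshold ends up rejected.
    for s in suggestions:
        if s.confidence <= reject_threshold:
            outcome[s.data_class] = "rejected"

    # Candidates for acceptance: at or above the threshold and not rejected.
    candidates = [
        s for s in suggestions
        if s.confidence >= accept_threshold and outcome[s.data_class] != "rejected"
    ]
    if candidates:
        top = max(c.confidence for c in candidates)
        winners = [c for c in candidates if c.confidence == top]
        # Accept only when a single suggestion holds the highest confidence level.
        if len(winners) == 1:
            outcome[winners[0].data_class] = "accepted"
    return outcome


# The two-column example, with an acceptance threshold of 85.
column_a = [Suggestion("Class X", 93), Suggestion("Class Y", 92), Suggestion("Class Z", 90)]
column_b = [Suggestion("Class X", 86), Suggestion("Class Y", 86)]
result_a = resolve_column(column_a, accept_threshold=85)
result_b = resolve_column(column_b, accept_threshold=85)
print(result_a)
print(result_b)
```

For column A, only the 93% suggestion is marked as accepted. For column B, both suggestions stay pending because they share the highest confidence level.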