Configure Cloud Data Classification Platform

Before you can use the Cloud Data Classification Platform in Data Catalog, you have to configure it.

Depending on your environment, follow this procedure either on the Services Configuration tab of the Collibra settings or in Collibra Console.

Important Editing the Services Configuration from the Settings page isn't available in the latest UI. If you use the latest UI, you can configure these settings only in Collibra Console. For more information, go to DGC service configuration settings.

Prerequisites

Steps

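  1. Open the configuration for editing, using one of the following methods.
    From the Collibra settings, open the Services Configuration page:
    1. On the main toolbar, click the Products icon, and then click Settings (the cogwheel icon).
      The Collibra settings page opens.
    2. Click Services Configuration.
    3. Click Edit configuration.
    From Collibra Console, open the DGC service settings for editing: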
    1. Open Collibra Console.
      Collibra Console opens with the Infrastructure page.
    2. In the tab pane, expand an environment to show its services.
    3. In the tab pane, click the Data Governance Center service of that environment.
    4. Click Configuration.
    5. Click Edit configuration.
  2. Go to the Data Classification section.
  3. Enter the required information:

    | Setting | Description |
    | --- | --- |
    | Machine Learning platform URL | The address of the machine learning platform that will classify your data. This setting requires the SUPER role. |
    | Requester Name | The unique name that identifies the client when using the Machine Learning platform. This setting requires the SUPER role. |
    | API key | The API key that authorizes the requester when connecting to the Machine Learning platform. This setting requires the SUPER role. |
    | Enable Data Classification | True: Enable Collibra's data classification technology. False (default): Do not use Collibra's data classification technology. |
  4. If needed, configure the automatic acceptance and rejection of classification suggestions.

    Setting

    Description

    Enable automatic classification acceptance and rejection

    True: The automatic acceptance and rejection of data classification suggestions is active.

    False (default): Data classification suggestions are not automatically accepted or rejected.

    Tip Start by manually accepting and rejecting suggested data classes. Activate automatic acceptance and rejection only when you are comfortable with the data classification results.

    Automatic acceptance threshold

    The confidence level, as a percentage, at or above which data classification suggestions are accepted automatically.
    For example, if you set this value to 75, classification suggestions with a confidence level of 75% or higher are accepted automatically.

    If multiple classification suggestions for a column meet the threshold, the suggestion with the highest confidence level is accepted automatically, but only if no other suggestion has the same confidence level.

    Example 

    You set the automatic acceptance threshold to 85%. You classify a table with two columns.

    • For column A, there are three classification suggestions, with confidence levels of 93%, 92%, and 90%.
    • For column B, there are two classification suggestions, both with a confidence level of 86%.

    Automatic acceptance produces the following results:

    • For column A, the classification suggestion with 93% is accepted automatically.
    • For column B, nothing is accepted automatically because both suggestions have the same confidence level. Both suggestions remain visible.

    The default acceptance threshold is 90.

    Automatic rejection threshold

    The confidence level, as a percentage, at or below which data classification suggestions are rejected automatically. For example, if you set this value to 49, all data classification suggestions with a confidence level of 49% or lower are rejected automatically.

    The default rejection threshold is 10.

    Note If the acceptance threshold and the rejection threshold are set to the same value, a data classification suggestion with exactly that confidence level is rejected.

  5. Click Save all.
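
The acceptance and rejection rules described in step 4 can be modeled with a short Python sketch. This is an illustration only, not Collibra's implementation: the `Suggestion` class, the `resolve_column` function, and the data class names are hypothetical. The sketch assumes that rejection is evaluated before acceptance, which reproduces the documented behavior when both thresholds have the same value. It also assumes that suggestions that are neither accepted nor rejected stay available for manual review, which this page does not state.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Suggestion:
    """Hypothetical model of one classification suggestion for a column."""
    data_class: str
    confidence: float  # confidence level as a percentage, 0 to 100


def resolve_column(
    suggestions: List[Suggestion],
    accept_threshold: float = 90,  # documented default acceptance threshold
    reject_threshold: float = 10,  # documented default rejection threshold
) -> Tuple[Optional[Suggestion], List[Suggestion], List[Suggestion]]:
    """Return (accepted, rejected, pending) for the suggestions of one column."""
    # Assumption: rejection is checked first, so a suggestion that sits exactly
    # on a shared acceptance/rejection threshold is rejected.
    rejected = [s for s in suggestions if s.confidence <= reject_threshold]
    remaining = [s for s in suggestions if s.confidence > reject_threshold]

    # Candidates are the remaining suggestions at or above the acceptance threshold.
    candidates = [s for s in remaining if s.confidence >= accept_threshold]

    accepted = None
    if candidates:
        top = max(s.confidence for s in candidates)
        winners = [s for s in candidates if s.confidence == top]
        # Accept only if exactly one suggestion has the highest confidence level.
        if len(winners) == 1:
            accepted = winners[0]

    # Assumption: everything else stays visible for manual review.
    pending = [s for s in remaining if s is not accepted]
    return accepted, rejected, pending


# The example from step 4, with hypothetical data class names.
column_a = [Suggestion("Email", 93), Suggestion("Username", 92), Suggestion("Text", 90)]
column_b = [Suggestion("City", 86), Suggestion("Country", 86)]

accepted_a, rejected_a, pending_a = resolve_column(column_a, accept_threshold=85)
accepted_b, rejected_b, pending_b = resolve_column(column_b, accept_threshold=85)

print(accepted_a.data_class)  # Email
print(accepted_b)             # None
print(len(pending_b))         # 2
```

For column A, the 93% suggestion is the only one with the highest confidence level, so it is accepted. For column B, the two 86% suggestions are tied, so neither is accepted and both remain pending.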