Start automatic classification via the Unified Data Classification method

Important 

In Collibra 2024.05, we launched a new user interface (UI) for Collibra Data Intelligence Platform! You can learn more about this latest UI in the UI overview.

Important 

To suggest a data class, automatic data classification needs enough data. Columns with very little data may not have a data class suggested.

  • The automatic data classification process needs at least six checkable values to classify a column.
    Example:
    For data class A, you define a regular expression and indicate you don't want to consider empty values.
    If you then classify a column with a lot of null values and five non-null values, the column won't get a data classification suggestion, even if the non-null values match data class A.
  • The automatic data classification process will extract a maximum of 1,000 values from the data source.
    The samples are temporarily stored in the Edge site cache and are not transferred to Collibra. If the Edge site cache already contains at least 100 samples for this data source, the automatic data classification process uses those samples.
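
The minimum-value rule above can be illustrated with a short Python sketch. It is not Collibra's implementation: the function names, the choice to take the first 1,000 values as the sample, and the five-digit regular expression standing in for "data class A" are all assumptions made for illustration. Only the thresholds (six checkable values, 1,000 extracted values) come from the text.

```python
import re
from typing import List, Optional, Sequence

MIN_CHECKABLE_VALUES = 6    # documented minimum of checkable values
MAX_SAMPLED_VALUES = 1000   # documented maximum of extracted values


def checkable_values(column: Sequence[Optional[str]]) -> List[str]:
    """Return the sampled values that remain once empty values are ignored.

    Assumption: the sample is simply the first MAX_SAMPLED_VALUES values.
    """
    sample = column[:MAX_SAMPLED_VALUES]
    return [v for v in sample if v is not None and v.strip() != ""]


def has_enough_values(column: Sequence[Optional[str]]) -> bool:
    """True if the column has enough checkable values to be classified."""
    return len(checkable_values(column)) >= MIN_CHECKABLE_VALUES


# Hypothetical "data class A": a regular expression for five-digit codes.
DATA_CLASS_A = re.compile(r"\d{5}")

# A column with many null values and only five non-null values.
sparse_column = [None] * 50 + ["10001", "10002", "10003", "10004", "10005"]

# Every non-null value matches the pattern of data class A ...
all_match = all(DATA_CLASS_A.fullmatch(v) for v in checkable_values(sparse_column))

# ... but five checkable values are below the minimum of six.
print(all_match)
print(has_enough_values(sparse_column))
```

In this sketch, the column mirrors the example above: all five non-null values match the pattern, yet the column falls short of the six-value minimum, so no suggestion would be made. Adding one more non-null value would bring it to the threshold.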

Prerequisites

Start the classification process for one column

  1. Go to the related Column asset.
  2. Start the classification in one of the following ways, depending on your UI:
     • Latest UI: In the At a Glance sidebar, click Classify.
     • Classic UI: Click the Data Profiling tab page, and then click the Classify button.
    The data classification process starts.
    If a data class matches the data in the column, a classification suggestion is assigned to the Column asset with a confidence percentage.

Start the classification process for one or more columns from a Table, Schema, or Database asset

  1. Go to the Table, Schema, or Database asset.
  2. Click Actions → Classify.
    The data classification process starts.
    If a data class matches a column, a data classification suggestion will be assigned to the Column asset with a confidence percentage.
  3. Open the Table asset with the classified columns.
  4. Add the Data Classification column to the table.
    In the Data Classification column, the suggested data classes are shown.
    Figure: Example of data classification suggestions.

What's next?

Accepting or rejecting data classification suggestions.