Example: Configuring a data class based on a regular expression and starting the automatic classification for a column

Important 

In Collibra 2024.05, we launched a new user interface (UI) for Collibra Data Intelligence Platform! You can learn more about this latest UI in the UI overview.

In this example, you configure the Email data class that you manually created and assigned, so that the Unified Data Classification method can assign it automatically.

Before you begin

Make sure you know which regular expression you want to use for the data class. For more information and references to useful resources, go to Add a data class.
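Before entering a regular expression in the data class, it can help to try it against a few sample values. The following is a minimal sketch in Python that uses the email pattern from this example. The sample values and the helper name matches_email are invented for illustration, and Python's regular expression engine may not behave identically to the one used by the classification method, so treat the results as an approximation.

```python
import re

# Email pattern used for the classification rule in this example.
EMAIL_PATTERN = re.compile(r"^([a-zA-Z0-9._%\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,6})$")

# Hypothetical sample values, invented for this sketch.
samples = [
    "jane.doe@example.com",
    "sales@example.org",
    "not-an-email",
    "user@host",
    "",
]


def matches_email(value: str) -> bool:
    """Return True if the whole value matches the email pattern."""
    return EMAIL_PATTERN.fullmatch(value) is not None


for value in samples:
    print(f"{value!r}: {matches_email(value)}")
```

Note that this pattern accepts only top-level domains of two to six letters, so a value such as user@example.technology does not match. Whether that matters depends on the data in your columns.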

Steps

  1. Configure the Email data class.
    1. On the main toolbar, click the Products icon, and then click Stewardship.
    2. Click the Data Classification tab.
    3. Select the Email data class row.
      The data class parameters appear in a pane on the right-hand side.
    4. Open the Details section.
    5. Complete the fields as required.
      For information on the fields, go to Add a data class.
      Data class parameter | Description
      Minimum confidence threshold | We set this value to 80.
      Allow empty values | We leave this field as the default value (False).
      Examples | We add the following examples: [email protected], [email protected], [email protected]
    6. Open the Classification rules section.
    7. Click Add new rule.
    8. In Type, select the Regular expression option.
      Extra fields appear.
    9. Complete the fields as required.
      For information on the fields, go to Add a data class.
      Data class parameter | Description
      Regular expression | We add: ^([a-zA-Z0-9._%\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,6})$
      Description | We leave this field empty.
    10. Click Save.
      The classification rule for the Email data class is configured.
      The Rule details section appears. If you expand the section, you see the details.

  2. Start the classification.
    1. Navigate to a Column asset with email data.
    2. Click Classify. The button is available in the At a Glance section and on the Data Profiling tab page.
      The data classification process starts. For more information, go to Automatically classify assets.
      The Email data classification suggestion is assigned to the Column asset with a confidence percentage. For more information, go to Accepting and rejecting data classification suggestions.
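
To build intuition for how the confidence percentage relates to the minimum confidence threshold of 80 configured in step 1, the following Python sketch computes the share of column values that match the regular expression. This is an illustration only, not the actual Unified Data Classification algorithm. The function name match_percentage, the sample column, and the handling of empty values (skipped when allowed, counted as non-matches otherwise) are assumptions made for this sketch.

```python
import re

# Email pattern from the classification rule in this example.
EMAIL_PATTERN = re.compile(r"^([a-zA-Z0-9._%\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,6})$")

# Minimum confidence threshold chosen in this example.
MIN_CONFIDENCE = 80


def match_percentage(values, allow_empty=False):
    """Return the percentage of considered values that match the pattern.

    Assumption for this sketch: when allow_empty is True, empty values
    are skipped; otherwise they count as non-matching values.
    """
    considered = [v for v in values if v or not allow_empty]
    if not considered:
        return 0.0
    hits = sum(1 for v in considered if EMAIL_PATTERN.fullmatch(v))
    return 100.0 * hits / len(considered)


# Hypothetical column content, invented for this sketch.
column = ["a@example.com", "b@example.org", "c@example.net", "d@example.com", "n/a"]

confidence = match_percentage(column)
print(f"confidence: {confidence:.0f}%")
print("meets threshold:", confidence >= MIN_CONFIDENCE)
```

In this sample, four of the five values match, so the computed percentage equals the threshold of 80. One more non-matching value would bring it below the threshold.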

What's Next?

You can now configure additional data classes to be used in the automatic classification for a column, table or schema.