About the Unified Data Classification method

What is the Unified Data Classification method?

The Unified Data Classification method is a data classification method that:

  • Works via Edge and requires specific setup.
    Because the data doesn't leave your organization's network, the automatic data classification process is more secure. The samples used during the automatic data classification process are temporarily added to the Edge site cache. They are not transferred to Collibra.
  • Saves time during the profiling activity.
    The classification no longer starts with the profiling activity. You can start a separate classification process for a specific asset with a dedicated Classify button.
  • Relies on classification rules specified for each data class.
    This means that the classification engine no longer relies on machine learning, which makes issues and changes more transparent. It also provides more flexibility and allows for customization.
  • Delivers optional out-of-the-box data classes.
    This means you decide which out-of-the-box data classes you want to use. You can also adjust the provided data classes to your own needs, for example by changing the name or the classification rules.
  • Works via a new REST API.
    With the new REST API, you can manage data classes and start the classification.
  • Will replace the current data classification via Edge and data classification via the Cloud Data Classification Platform over time.
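
To illustrate why rule-based classification is easier to inspect than a machine learning model, here is a minimal sketch in Python. It is not Collibra's implementation or rule format: the `DataClassRule` structure, the regular expressions, and the `min_match_ratio` threshold are all hypothetical and chosen only for illustration. The idea is that each data class carries an explicit rule, and a column is assigned a data class when enough of its sample values satisfy that rule.

```python
import re
from dataclasses import dataclass


@dataclass
class DataClassRule:
    """Hypothetical rule: data class name, regex, and minimum share of matching samples."""
    data_class: str
    pattern: str
    min_match_ratio: float


def classify_column(samples, rules):
    """Return (data_class, match_ratio) pairs for every rule the samples satisfy."""
    non_empty = [s for s in samples if s]
    if not non_empty:
        return []
    matches = []
    for rule in rules:
        regex = re.compile(rule.pattern)
        # A sample counts as a hit only if the whole value matches the rule.
        hits = sum(1 for s in non_empty if regex.fullmatch(s))
        ratio = hits / len(non_empty)
        if ratio >= rule.min_match_ratio:
            matches.append((rule.data_class, ratio))
    # Best-matching data class first.
    return sorted(matches, key=lambda m: m[1], reverse=True)


# Illustrative rules only; real data classes would define their own.
RULES = [
    DataClassRule("Email Address", r"[^@\s]+@[^@\s]+\.[^@\s]+", 0.8),
    DataClassRule("Year", r"(19|20)\d{2}", 0.8),
]

samples = ["ann@example.com", "bob@example.org", "n/a", "eve@example.net", "joe@example.com"]
print(classify_column(samples, RULES))
```

Because the outcome depends only on the rule and the samples, an unexpected classification can be traced to a specific pattern or threshold and corrected by editing that rule.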
Important 
  • The automatic data classification process is not available in on-premises environments. You can, however, create data classes and manually classify your data.
  • Data classes and classifications created via the Unified Data Classification method are kept separate from the old data classes and classifications. The old data classes and classifications are no longer visible if you enable Unified Data Classification. The opposite is also true.
  • This method is the default data classification method via Edge for new environments.
    In upcoming releases, processes will become available to migrate data from the old classification methods to this method. Data classes transferred from the old methods will need to be updated to include classification rules before they work with the new automatic data classification method.

Tip You can take training courses and watch videos on Collibra University.

Limitations

Currently, you can't:

  • Merge data classes.
  • Migrate existing data classes and existing data classifications to the new data classification method.

Why do we need a new data classification method?

  • Organizations want to create custom data classes that can be used and detected by the automatic data classification process on Edge.
  • Running data classification together with profiling, as the Edge data classification method does, isn't aligned with an organization's needs. The data class of a specific column hardly ever changes, whereas the profiling statistics do. Classification therefore doesn't need to run as often as profiling.
  • The Cloud Data Classification Platform can no longer remain available because of issues with error control in machine learning. With machine learning, it is hard to understand why a column is classified in a specific way and to resolve incorrect classifications.