About the Unified Data Classification method
Important Unified Data Classification is in beta. Activate this feature only in your Test environments. Don't enable it in Production environments yet, because it is not production-ready.
Why do we need a new data classification method?
- Organizations want to create custom data classes that can be used and detected by the automatic data classification process on Edge.
- Running data classification together with profiling, as the Edge data classification method does, isn't aligned with an organization's needs. The data class of a specific column hardly ever changes, whereas the profiling statistics do. Classification therefore does not need to run as often as profiling.
- The Cloud Data Classification Platform can no longer remain available due to issues with error control in machine learning. With machine learning, it is hard to understand why a column is classified in a specific way and to fix incorrect classifications.
What is the Unified Data Classification method?
The Unified Data Classification method:
- Works via Edge and requires specific setup.
  Because the data doesn't leave your organization's network, the automatic data classification process is more secure. The samples used during the automatic data classification process are stored between 24 and 48 hours in the Edge Site cache. They are not transferred to Collibra.
- Saves time during the profiling activity.
  The classification no longer starts with the profiling activity. You can start a separate classification process for a specific asset with a dedicated Classify button.
- Relies on classification rules specified for each data class.
  This means that the classification engine no longer relies on machine learning, which will make issues and changes more transparent. This also provides more flexibility and allows for customizations.
- Delivers optional out-of-the-box data classes.
  This means you decide which out-of-the-box data classes you want to use. It also allows you to adjust the provided data classes to your own needs, like changing the name or changing the classification rules.
- Works via a new REST API.
  With the new REST API, you can manage data classes and start the classification.
- Will replace the current data classification via Edge and data classification via the Cloud Data Classification Platform over time.
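To illustrate why rule-based classification is easier to inspect than a machine learning model, here is a minimal sketch of the general idea in Python. It is not Collibra's engine or rule format: the `DataClass` structure, the regular-expression rule, the `min_match_ratio` threshold, and the `classify_column` function are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DataClass:
    """Hypothetical data class: a name, one regex rule, and a match threshold."""
    name: str
    pattern: str            # regular expression a sample value must fully match
    min_match_ratio: float  # fraction of non-empty samples that must match


def classify_column(samples, data_classes):
    """Return (data class name, match ratio) pairs that meet their threshold, best first."""
    # Ignore empty or whitespace-only samples.
    values = [s.strip() for s in samples if s and s.strip()]
    if not values:
        return []
    results = []
    for dc in data_classes:
        rule = re.compile(dc.pattern)
        matched = sum(1 for v in values if rule.fullmatch(v))
        ratio = matched / len(values)
        if ratio >= dc.min_match_ratio:
            results.append((dc.name, ratio))
    return sorted(results, key=lambda r: r[1], reverse=True)


# Illustrative rules only; real data classes would define their own.
EMAIL = DataClass("Email address", r"[^@\s]+@[^@\s]+\.[^@\s]+", 0.8)
YEAR = DataClass("Year", r"(19|20)\d{2}", 0.9)

column_samples = [
    "ann@example.com", "bob@example.org", "",
    "carol@example.net", "n/a", "dan@example.com",
]
matches = classify_column(column_samples, [EMAIL, YEAR])
print(matches)
```

Because each result can be traced back to a named rule and a match ratio, an incorrect classification can be investigated by checking the rule and the sampled values, which is the kind of transparency described above.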
Important
- The automatic data classification process is not available in on-premises environments. You can still create data classes and classify your data manually.
- Data classes and classifications created via the Unified Data Classification method are separated from the old data classes and classifications. The old data classes and classifications are no longer visible if you enable Unified Data Classification. The opposite is also true.
- When Unified Data Classification becomes generally available, a classification migration process will be put in place. This process will transfer all old data classes and data classifications to the new Unified Data Classification method. Old transferred data classes will need to be updated to include classification rules to work with the new automatic data classification method.
Tip You can take training courses and watch videos on Collibra University.
Beta limitations
In this phase, you cannot:
- Merge data classes.
- Search on data classes and classifications.
- Migrate existing data classes and existing data classifications to the new data classification method.
The global permissions "Classification > Data Classes > Classify" and "Classification > Data Classes > Read" are not enforced yet: you can assign them, but they are not yet taken into account. For more information, go to Enable Unified Data Classification.