Automatic acceptance and rejection of classification suggestions
During the data classification process, we predict the data class for a column. The percentage next to the data class indicates the confidence level of the data classification suggestion.
You can configure thresholds to automatically accept or reject the data classification suggestions.
The automatic classification acceptance and rejection feature applies the following rules:
- The data classification suggestion with the highest confidence score is accepted automatically if its confidence score is equal to or above the acceptance threshold and no other suggestion has the same confidence score.
- Data classification suggestions with a confidence score equal to or lower than the rejection threshold are automatically rejected.
- If a data classification suggestion meets the conditions for both automatic acceptance and automatic rejection, it is automatically rejected.
- If you set the automatic acceptance threshold to 75%, then a data classification suggestion with a confidence level of 75% or higher is accepted automatically if that is the highest score and if the suggestion is the only one with that score.
- If you set the automatic rejection threshold to 49%, then a data classification suggestion with a confidence level of 49% or lower is automatically rejected and does not appear for the column.
- After a data class has been accepted or rejected for a column, rerunning the data classification process no longer suggests that data class for the column. The column is, however, still checked for other data classes that may match its data.
- You can use the thresholds with both classification methods, via Cloud Data Classification Platform and via Edge.
Tip: Start by manually accepting and rejecting suggested data classes. Only activate the automatic acceptance and rejection feature if you are comfortable with the data classification results.
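
The acceptance and rejection rules described above can be sketched in a few lines of Python. This is only an illustration of the documented rules, not Collibra's implementation or API: the function name `apply_thresholds`, the dictionary of suggestions, and the data class names are all hypothetical, and confidence scores are assumed to be whole percentages.

```python
from typing import Dict, Optional, Set, Tuple


def apply_thresholds(
    suggestions: Dict[str, int],
    accept_threshold: int,
    reject_threshold: int,
) -> Tuple[Optional[str], Set[str]]:
    """Hypothetical helper: return (accepted data class or None, rejected data classes).

    `suggestions` maps a data class name to its confidence score in percent.
    """
    # Rule 2: every suggestion at or below the rejection threshold is rejected.
    rejected = {
        name for name, score in suggestions.items() if score <= reject_threshold
    }

    accepted = None
    if suggestions:
        top_score = max(suggestions.values())
        leaders = [name for name, score in suggestions.items() if score == top_score]
        # Rule 1: accept only the single highest-scoring suggestion, and only
        # if it reaches the acceptance threshold.
        # Rule 3: a suggestion that qualifies for both outcomes is rejected.
        if (
            top_score >= accept_threshold
            and len(leaders) == 1
            and leaders[0] not in rejected
        ):
            accepted = leaders[0]

    return accepted, rejected


# Thresholds from the examples in this topic: accept at 75%, reject at 49%.
result = apply_thresholds({"Email": 82, "Phone Number": 60, "Last Name": 40}, 75, 49)
# result == ("Email", {"Last Name"}); "Phone Number" stays a pending suggestion.

# Two suggestions share the highest score, so nothing is accepted automatically.
tie = apply_thresholds({"Email": 80, "Username": 80}, 75, 49)
# tie == (None, set())
```

In this sketch, a suggestion that falls between the two thresholds is neither accepted nor rejected and would still need a manual decision.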