Example | Configuring a data class based on a list of values and starting automatic classification for a table
You want to create a custom data class for T-shirt sizes. Once that is done, you want to start the classification process for a full table.
Prerequisites
- Make sure you know which values you use in the organization to refer to T-shirt sizes. In this case, consider: XS, S, M, L, XL, XXL, XXXL, Extra small, Small, Medium, Large, X-Large, XX-Large, XXX-Large, 2XL, 3XL.
For more information, go to Add a data class.
- You have the required permissions.
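The Values field used later in this example expects one value per line, and matching is not case-sensitive, so it can help to clean up the collected values first. The following is a minimal sketch of that preparation step. The helper name `to_values_field` and the sample input are hypothetical and are not part of the product.

```python
def to_values_field(values):
    """Return the values as newline-separated text, one value per line.

    Hypothetical helper: trims whitespace and drops entries that differ
    only by character case, keeping the first spelling encountered.
    """
    seen = set()
    lines = []
    for value in values:
        cleaned = value.strip()
        key = cleaned.casefold()
        if cleaned and key not in seen:
            seen.add(key)
            lines.append(cleaned)
    return "\n".join(lines)


# Sample input collected from different teams (illustrative only).
collected = ["XS", "S", "Small", "small ", "M", "Medium", "2XL", "2xl"]
values_text = to_values_field(collected)
print(values_text)
```

The resulting text can be pasted into a field that takes one value per line.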
Steps
- Create and configure the data class T-shirt size.
- On the main toolbar, go to Stewardship.
- Go to Data Classification → Data classes.
- Add the data class.
- Click Add.
- Add the Name of the data class. In this case, T-shirt size.
- Press Enter to add the data class.
- Click Create.
The data class has been created and is available in the list.
- Define the data class parameters.
- Hover over the new class row and click Preview.
The data class parameters appear in a pane on the right-hand side.
- Optionally, add a description by clicking the Description field, typing the description, and saving it.
- Open the Details section.
- Complete the fields as required.
For information on the fields, go to Configuring data classes.
Data class parameter | Description
Minimum confidence threshold | Set this value to 80.
Include empty values | Leave this field as the default value (No).
Examples | Add Small, L
- Open the Classification rules section.
- Click Add new rule.
- In the Type list, select List of values for data.
Extra fields appear.
- Complete the fields as required.
For information on the fields, go to Configuring data classes.
Data class parameter | Description
Values | Add the following list. Each value must start on a new line.
  XS
  S
  M
  L
  XL
  XXL
  XXXL
  extra small
  small
  medium
  large
  X-large
  XX-large
  XXX-large
  2XL
  3XL
Description | Leave this field empty.
- Click Save.
The classification rule for the data class is configured.
Expand the Classification rules section to view the details.
- Start the automatic classification.
- Open a Table asset.
- Select Actions → Classify.
The data classification process starts. For more information, go to Automatically classify assets.
If a data class matches a column in the Table asset, a data classification suggestion is assigned to the Column asset with a confidence percentage. For more information, go to Accepting and rejecting data classification suggestions.
Note
The values are not case-sensitive: the value “small” in the list also matches the values “Small” and “SMALL”.
Example:
A column contains the values petite, s, L, xl, XL, unknown, unknown, and no size. After the automatic data classification, the column is classified as a T-shirt size with a confidence score of 50% because four of the eight values in the column are part of the list of values.
Note that the character case didn’t affect the result.
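The arithmetic behind this example can be reproduced with a short sketch. It assumes that the confidence score is the share of column values found in the list of values, compared without regard to case, as the example describes. The function name `list_of_values_confidence` is hypothetical, and the product's actual implementation may differ.

```python
def list_of_values_confidence(column_values, allowed_values):
    """Return the share of column values found in the allowed list.

    Assumption: the comparison ignores character case, as in the note above.
    """
    allowed = {value.casefold() for value in allowed_values}
    if not column_values:
        return 0.0
    matches = sum(1 for value in column_values if value.casefold() in allowed)
    return matches / len(column_values)


# The list of values configured for the T-shirt size data class.
T_SHIRT_SIZES = [
    "XS", "S", "M", "L", "XL", "XXL", "XXXL",
    "extra small", "small", "medium", "large",
    "X-large", "XX-large", "XXX-large", "2XL", "3XL",
]

# The column from the example: s, L, xl, and XL match, the rest do not.
column = ["petite", "s", "L", "xl", "XL", "unknown", "unknown", "no size"]
confidence = list_of_values_confidence(column, T_SHIRT_SIZES)
print(f"{confidence:.0%}")  # prints 50%
```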
You can also add an extra classification rule to an existing data class.