Labeling / Training
Item Labeling
Click a finding to label it. Some labeling options also trigger retraining.
Action labeling options
The following action labels instruct Collibra on how to handle a finding:
Action | Description |
---|---|
Validate | Instructs Collibra either to assign the finding to a specific user for review, in which case it appears in the Assignment Queue, or to acknowledge, without an assignee, that the finding is a valid observation. Note: Validating a finding does not improve your score. |
Invalidate | Instructs Collibra to ignore a finding and allow the value to pass. There are two invalidation options: Save and Save & Retrain. Save: Marks a finding as invalidated. Save & Retrain: Invalidates the finding together with any previously saved invalidated findings, then retrains the dataset. Note: When you have many findings to invalidate, it may be best to select Save on each finding as you review it, then select Save & Retrain on the last one so that all of them are invalidated at the same time. |
Resolve | Instructs Collibra to mark the finding as an observation and prevent it from appearing in future runs. Resolving a finding does not immediately affect data quality scores. |
Available actions by feature
Feature | Available actions |
---|---|
Behaviors | Validate, Resolve |
Rules | Validate, Resolve |
Outliers | Validate, Invalidate, Resolve |
Pattern | Validate, Invalidate, Resolve |
Source | Validate, Invalidate, Resolve |
Record | Validate, Resolve |
Dupes | Validate, Invalidate, Resolve |
Warning: Not every labeling option is available for every finding. For example, you can apply only the Validate and Resolve labels to findings that result from Rules.
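The table above can be read as a simple lookup from feature to permitted labels. The following Python sketch is purely illustrative: `AVAILABLE_ACTIONS` and `can_apply` are hypothetical names, not part of any Collibra API, and the mapping only restates the table.

```python
# Illustrative lookup only: restates the "Available actions by feature" table.
# AVAILABLE_ACTIONS and can_apply are hypothetical names, not a Collibra API.
AVAILABLE_ACTIONS = {
    "Behaviors": {"Validate", "Resolve"},
    "Rules": {"Validate", "Resolve"},
    "Outliers": {"Validate", "Invalidate", "Resolve"},
    "Pattern": {"Validate", "Invalidate", "Resolve"},
    "Source": {"Validate", "Invalidate", "Resolve"},
    "Record": {"Validate", "Resolve"},
    "Dupes": {"Validate", "Invalidate", "Resolve"},
}


def can_apply(feature: str, action: str) -> bool:
    """Return True if the table lists `action` for `feature`."""
    return action in AVAILABLE_ACTIONS.get(feature, set())


print(can_apply("Rules", "Invalidate"))     # False
print(can_apply("Outliers", "Invalidate"))  # True
```

Features that are not in the table return `False` for every action, which matches the cautious reading of the warning above.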
Validating a finding
Invalidating a finding
During a job run, DQ may flag issues on the findings page that you want it to ignore. The Invalidate label lets you do that. After you add a descriptive annotation of your action, select either Save or Save & Retrain.
Save
If DQ has flagged a large number of findings and you want to invalidate them together rather than retraining after each one, select Save on every finding you want to include in the bulk invalidation. On the last finding, select Save & Retrain. All previously saved invalidated findings are then removed and DQ retrains your dataset.
Save & Retrain
When you select Save & Retrain, any points previously deducted for the invalidated findings are restored and reflected in your overall data quality score. If you do not have many findings to invalidate, you can select Save & Retrain on each finding individually instead of invalidating in bulk.
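The difference between the two options can be sketched as a queue of pending invalidations that is applied in one pass. This is a toy model under stated assumptions, not Collibra's implementation: `LabelingSession`, its methods, and the finding IDs are all hypothetical, and retraining is represented only by a counter.

```python
from dataclasses import dataclass, field


@dataclass
class LabelingSession:
    """Toy model of Save vs. Save & Retrain (hypothetical, not a Collibra API)."""

    # Invalidations that were saved but not yet applied by a retrain.
    pending: list = field(default_factory=list)
    # Number of retrains triggered so far.
    retrain_count: int = 0

    def save(self, finding_id: str, annotation: str) -> None:
        """Save: record the invalidation without retraining."""
        self.pending.append((finding_id, annotation))

    def save_and_retrain(self, finding_id: str, annotation: str) -> list:
        """Save & Retrain: apply this and all saved invalidations, then retrain once."""
        self.save(finding_id, annotation)
        applied, self.pending = self.pending, []
        self.retrain_count += 1
        return applied


session = LabelingSession()
session.save("finding-1", "expected seasonal spike")
session.save("finding-2", "expected seasonal spike")
applied = session.save_and_retrain("finding-3", "expected seasonal spike")
print(len(applied), session.retrain_count)  # 3 1
```

In this model, three findings are invalidated with a single retrain, which is the reason to prefer Save followed by one Save & Retrain when there are many findings to review.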
Resolving a finding
Some features, such as Behaviors and Rules, permit only the Validate and Resolve actions. When you cannot Invalidate a finding but you want to apply a label, select Resolve. The Resolve label prevents a finding from appearing in future runs of your dataset and does not immediately affect your data quality score when applied.
Recalling labeled findings
To modify a previously labeled finding, open it from the Labels tab. There you can edit its annotation or delete the label entirely. If you delete a label, the finding returns to the findings page, unlabeled. From there you can again choose to Validate, Invalidate, or Resolve it.
To see when a finding received a label, who applied it, and more, see the Dataset Audit Trail.