Dupes
The Dupes monitor detects column values that match other existing column values.
The following table shows the available columns in the Dupes tab.
Column | Description |
---|---|
Type | The type of finding, for example, DUPE. |
Score | The percentage to which two or more duplicate values match. A score of 100 indicates that the values are exact matches of each other, whereas a lower score, such as 85, indicates that they are fuzzy matches. |
[Column] | The column where the duplicate value appears. Case-insensitive exact-match duplicates for Pullup datasets display in all lowercase. Although the casing of the original values may vary, this is the expected behavior. Note: The name of this column is dynamic and depends on how it appears in your table, file, or view. For example, when the column in your data source that contains the duplicate values is called last name, the column in the Findings table is also called last name. |
Occurs | The number of duplicate values in a column. |
Profile | The user account that is assigned to this dupes finding. When the Status is Assigned, a user profile displays in this column. Note: When a dupes finding is unassigned, the Profile column is empty. |
Status | Lets you label and train a finding. The available dropdown menu options are Validate, Invalidate, and Resolve. Validate instructs Collibra DQ either to assign a finding to a specific user for review, after which it appears in the Assignment Queue, or to acknowledge, without an assignee, that the finding is a valid observation. Invalidate instructs Collibra DQ to ignore a finding and allow the value to pass. There are two invalidation options. Tip: When you have many findings to invalidate, it may be best to review all of them first and then use the Save option to invalidate them at the same time. Resolve instructs Collibra DQ to mark the finding as an observation and prevents it from appearing in future runs. Resolving a finding does not immediately affect data quality scores. |
Action | In Pushdown mode, you can download a CSV or JSON file containing details of the break records. Note: This column does not display for DQ Jobs created in Pullup mode. |
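To make the Score and Occurs columns more concrete, the following is a minimal Python sketch of how exact and fuzzy duplicate scoring could work in principle. It is not Collibra DQ's implementation: the function names `match_score`, `find_dupes`, and `exact_occurrences`, the default threshold of 85, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher
from itertools import combinations


def match_score(a: str, b: str) -> int:
    """Return a 0-100 similarity score for two values, ignoring case.

    Hypothetical helper: SequenceMatcher stands in for whatever
    matching algorithm the product actually uses.
    """
    ratio = SequenceMatcher(None, a.lower(), b.lower()).ratio()
    return round(ratio * 100)


def find_dupes(values, threshold=85):
    """Yield (value_a, value_b, score) for pairs scoring at or above threshold.

    The threshold of 85 is an assumed example value, not a product default.
    """
    for a, b in combinations(values, 2):
        score = match_score(a, b)
        if score >= threshold:
            yield a, b, score


def exact_occurrences(values):
    """Count case-insensitive exact matches, keyed by the lowercased value."""
    counts = Counter(v.lower() for v in values)
    return {value: n for value, n in counts.items() if n > 1}
```

In this sketch, `match_score("Smith", "SMITH")` returns 100 because the comparison ignores case, and `exact_occurrences` reports such values under the lowercased key `smith`, which mirrors the all-lowercase display described for the [Column] field.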
Exporting dupes records
Click Export above the drill-in table to generate an Excel file with the details from the drill-in.