Dupes

The Dupes monitor detects column values that match other existing column values.

The following table shows the available columns in the Dupes tab.

Column Description
Type The type of finding, for example, DUPE.
Score The percentage by which two or more duplicate values match. A score of 100 indicates that the duplicate values are exact matches of each other, whereas a lower score, such as 85, indicates that the duplicate values are fuzzy matches.
[Column]

The column where the duplicate value appears.

Case-insensitive exact-match duplicates for Pullup datasets display in all lowercase. This is the expected behavior, even when the casing of the original values varies.

Note The name of this column is dynamic and depends on how the column appears in your table, file, or view. For example, when the column in your data source that contains the duplicate values is named last name, the column in the Findings table is also named last name.

Occurs The number of duplicate values in a column.
Profile

The user account that is assigned to this dupes finding. When the Status is Assigned, a user profile displays in this column.

Note When a dupes finding is unassigned, the Profile column is empty.

Status

Lets you label and train a finding. The available dropdown menu options are Validate, Invalidate, and Resolve.

Validate instructs Collibra DQ to either assign a finding to a specific user for review, in which case it appears in the Assignment Queue, or acknowledge, without an assignee, that the finding is a valid observation.

Invalidate instructs Collibra DQ to ignore a finding and allow the value to pass. There are two invalidation options:

  • Save lets you mark a finding as invalidated.
  • Save & Retrain lets you invalidate a finding and retrain on it together with any previously saved invalidated findings.

Tip When you have many findings to invalidate, it may be best to review all of them first and then use the Save option to invalidate them at the same time.

Resolve instructs Collibra DQ to mark the finding as an observation and prevents it from appearing in future runs. Resolving a finding does not immediately affect data quality scores.

Action

In Pushdown mode, you can download a CSV or JSON file containing details of the break records.

Note This column does not display for DQ Jobs created in Pullup mode.
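To make the Score and Occurs columns in the table above more concrete, the following Python sketch shows one way a match percentage and a case-insensitive occurrence count could be computed. It is an illustration only: the matching algorithm Collibra DQ uses is not described here, so the sketch substitutes the standard library's difflib.SequenceMatcher, and the function names match_score and case_insensitive_occurrences are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def match_score(a: str, b: str) -> int:
    """Return a 0-100 similarity percentage between two values.

    Assumption: SequenceMatcher stands in for whatever fuzzy-matching
    method the product actually applies.
    """
    return round(SequenceMatcher(None, a, b).ratio() * 100)


def case_insensitive_occurrences(values: list[str]) -> dict[str, int]:
    """Count values that occur more than once when casing is ignored.

    Keys are lowercased, mirroring how case-insensitive exact matches
    display in all lowercase.
    """
    counts = Counter(v.lower() for v in values)
    return {value: n for value, n in counts.items() if n > 1}


if __name__ == "__main__":
    print(match_score("Smith", "Smith"))  # identical values score 100
    print(match_score("Smith", "Smyth"))  # near matches score below 100
    print(case_insensitive_occurrences(["Smith", "SMITH", "smith", "Jones"]))
```

In this sketch, identical values produce a score of 100 and similar values produce a lower score, which corresponds to the distinction between exact and fuzzy matches in the Score column.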

Exporting dupes records

Click Export above the drill-in table to generate an Excel file with the details from the drill-in.