Rules
Collibra DQ takes a strong stance that data should first be profiled, auto-discovered and learned before applying basic rules. This methodology commonly removes thousands of rules that will never need to be written and evolve naturally overtime. However there are still many cases to add a simple rule, complex rule or domain specific rule.
The score on the Rules tab is the rule percentage (% column) multiplied by the number of points (Points column), rounded to a whole number. If a rule with a Percent Scoring Type does not meet or exceed its defined percentage, the score on the Rules tab remains 0.
The following table describes the information available on the Rules tab of the Findings page.
Column | Description | ||||||||
---|---|---|---|---|---|---|---|---|---|
Rule Name | The name of the rule. | ||||||||
Condition |
The rule condition that is defined in the rule. An alert generates when the condition is met. In Pullup mode, the Condition column is derived from the rule’s SQL column. In Pushdown mode:
|
||||||||
Points | The number of points to deduct from the quality score for a given rule break. | ||||||||
Perc | The ratio of the total number of breaking records over the total number of rows. | ||||||||
Breaking Records | The number of rows with records that did not pass the conditions of the rule. If the status of the rule is Exception or Passing, then the value should be 0. | ||||||||
Passing Records | The number of rows with records that passed the conditions of the rule. | ||||||||
Status |
The status of the rule. The following table shows the possible statuses:
|
||||||||
Dimension | The DQ Dimension that is identified in the rule finding, for example, Completeness. | ||||||||
State | Shows if a rule is enabled or disabled. | ||||||||
Status | Allows you to validate or resolve an observation and, when applicable, assign it to a user for further analysis. | ||||||||
Profile |
The user account that is assigned to this rule finding. When the Status is Assigned, a user profile displays in this column. Note When a rule finding is unassigned, the profile column is empty. |
||||||||
Action |
Rule Breaks lets you preview the rule break export file and download either a CSV or JSON file, depending on your processing mode, giving you more control over how you use and share break records.
Note The ability to copy rule breaks is limited to secure Cloud Native deployments of Collibra DQ. You cannot copy rule breaks in Standalone deployments. |
Rule Discovery
Rule Discovery detects the data classes assigned to a selected data category. The Rule Discovery algorithm automatically selects the best match if a column matches two or more data classes. Data class match criteria are determined by percent match and name distance.
Click Rule Discovery, then select an option from the Data Category dropdown menu. Click Run Discovery to assign your selection as a data category and run the discovery job.
Break records in the PostgreSQL Metastore
When a rule returns breaking records, the following query inserts unique records into the PostgreSQL Metastore rule_breaks table based on dataset, run_id, rule_nm, and link_id:
INSERT INTO rule_breaks (dataset, run_id, rule_nm, link_id)
VALUES (:dataset, :runId, :ruleNm, :linkId)
ON CONFLICT (dataset, run_id, rule_nm, link_id)
DO UPDATE SET
dataset = :dataset,
run_id = :runId,
rule_nm = :ruleNm,
link_id = :linkId
WHERE
rule_breaks.dataset = :dataset
AND rule_breaks.run_id = :runId
AND rule_breaks.rule_nm = :ruleNm
AND rule_breaks.link_id = :linkId;
Note Because only unique records are inserted into the rule_breaks table, the number of records in the rule_breaks table might not match the number of breaking records displayed on the Findings page.
Pulse View Preview
The Pulse View gives you a data preview box plot for simple rules that are breaking. You can click any available box to drill into the data of that day.
Note Pulse View Preview is not available for passing rules or runs without data.
Exporting rule break records
There are three options to export the details of your rule break records as .xlsx files:
- Export generates an Excel file with the details from the drill-in.
- Export LinkIds generates an Excel file with the name of the dataset, Run Id, Rule Name, and Link Id, when available.
- Export with Details generates an Excel file with the details from the drill-in and the data preview, when available.
Note The ability to copy rule breaks is limited to secure Cloud Native deployments of Collibra DQ. You cannot copy rule breaks in Standalone deployments.
Limitations
Total records are calculated by adding the total number of breaking and passing records. Therefore, when referencing a secondary dataset in a native rule, the extra rows from the secondary dataset are included in the query results on the Findings page, skewing the total rows calculation and the percentage of breaking records.