Adding a Rule

To add a rule, go to the Rules page. There are two ways to access the Rules page in Collibra DQ:

  • From the left navigation bar.
  • From the findings page.

To access the Rules page from the left navigation bar, click the wrench icon and then Rule Builder. From the Rule Builder page, select a data set and a rule type.

Access Rules from left navigation bar

To access the Rules page from the findings page, open a DQ Job to display the findings page. From the findings page, click Rules in the metadata box in the upper right of the page. The Rule Builder opens. Since you're navigating to the Rule Builder from the findings page directly, you do not have to select a data set. In this case, select a rule type to get started.

Access Rules from the findings page

Instructions

  1. Search for a data set or navigate to the Rule Builder page in the left navigation panel.
    • A rule can be applied to a data set only after a DQ job has run on that data set at least once.
  2. Click Load.
    • The schema and any previously saved rules populate.
  3. Select a rule type from the dropdown next to the Type label.
  4. Enter a rule name.
    • If you apply a preset rule, the rule name is auto-populated.
  5. Enter a rule condition.
    • Required only for the simple, freeform SQL, stat, or native rule types.
    • Provide a value in the condition/sql/function input field.
    • Press Ctrl+Space for IntelliSense suggestions.
  6. Select Low, Medium or High for scoring severity (optional).
  7. Add any custom DQ dimensions for reporting (optional).
  8. Click Submit to save the rule.

Search for a dataset and click Select next to the Type label

The rule is evaluated on the next DQ job run for that data set.

Rule Types

Rule type | Description | Example
Simple rules | Use a simple rule to filter on a condition on a single column in a single table. | City = 'Baltimore'
Freeform SQL rules | Use a freeform SQL rule to apply a condition across multiple tables or columns, or when you need more flexibility or customization. | select * from dataset where name = 'Collibra'
Preset rules | Use a preset rule to quickly add a strict condition check. Commonly used conditions are available to add to any data set column. |

All built-in Spark functions are available in simple and freeform SQL rules. For the function reference, visit https://spark.apache.org/docs/2.3.0/api/sql/.
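The difference between a simple rule and a freeform SQL rule can be sketched with the example conditions from the rule types table above. This sketch uses Python's built-in sqlite3 module as a stand-in for Spark SQL, and the table contents are invented for illustration. It assumes a simple rule behaves like a WHERE condition on one table, which is an interpretation of the description above and not a statement about how Collibra DQ executes rules internally. The helper names run_simple_rule and run_freeform_rule are hypothetical.

```python
import sqlite3

# Toy stand-in for a data set; the table name matches the freeform example,
# but the rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset (name TEXT, City TEXT)")
conn.executemany(
    "INSERT INTO dataset VALUES (?, ?)",
    [("Collibra", "Baltimore"), ("Acme", "Boston"), ("Globex", "Baltimore")],
)


def run_simple_rule(condition: str) -> list:
    # Assumption: a simple rule acts like a WHERE condition on a single table.
    return conn.execute(f"SELECT * FROM dataset WHERE {condition}").fetchall()


def run_freeform_rule(sql: str) -> list:
    # A freeform SQL rule is a complete query written by the user.
    return conn.execute(sql).fetchall()


simple_hits = run_simple_rule("City = 'Baltimore'")
freeform_hits = run_freeform_rule("select * from dataset where name = 'Collibra'")
print(len(simple_hits), len(freeform_hits))
```

With these toy rows, the simple rule matches the two Baltimore records and the freeform rule matches the single Collibra record.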

Points and Percentage

Points and Percentage control how much a rule affects the data quality score: for every x percent of rows that break the rule, y points are deducted from the score. For example, if a rule is triggered 10 times out of 100 rows, break records occur 10% of the time. If you enter 1 point for every 1 percent, 10 points are deducted from the overall score.
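The arithmetic in the example above can be sketched as a small function. This is an illustration only: it assumes the deduction scales linearly with the break percentage, and it ignores any rounding or capping that the product may apply. The function name score_deduction and its parameters are hypothetical.

```python
def score_deduction(breaks: int, total_rows: int, points: float, percentage: float) -> float:
    """Points deducted when `points` are charged for every `percentage` percent of breaking rows."""
    if total_rows <= 0 or percentage <= 0:
        raise ValueError("total_rows and percentage must be positive")
    # Share of rows that broke the rule, expressed as a percentage.
    break_pct = 100.0 * breaks / total_rows
    # Assumption: linear scaling, with no rounding or cap.
    return points * (break_pct / percentage)


# The example from the text: 10 breaks in 100 rows, 1 point per 1 percent.
deduction = score_deduction(breaks=10, total_rows=100, points=1, percentage=1)
print(deduction)  # 10.0
```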

Creating Your First Rule

Let’s create a simple rule on the shape_example data set using the information below.

  1. Search for “shape_example” and click “Load”.
  2. Select the “Simple Rule” type.
  3. Enter the rule name lnametest.
  4. Enter the condition @shape_example.lname = 'hootbeck' (this should hit one time, day over day).
  5. Set Points to 1.
  6. Set Percentage to 1.
  7. Click “Submit”.

creating a simple rule with the rule builder

After you submit the rule, it appears in the list of rules, as shown below.

Using the Rules tab

Rule scores will appear under the Rules tab on the findings page. You can also see more details in the bottom panel of the Rules page under the Rules and Results tabs.

Findings page rule results

successfully saved rule under rules tab

Click the plus icon next to the Rule Name to drill into any available rule. The following table describes the columns when you drill into a rule:

Column | Description
dataset | The data set to which the rule applies.
ruleNm | The unique name of the rule.
ruleValue | The column of your table, view, or schema the rule queries against.
modType | The audit sequence of the rule. There are four possible values: i (Insert), u (Update), d (Delete), and s (Status change).
updtTs | The timestamp of the last run of the rule.
isActive | Whether the rule is active. Active rules have a value of 1 and inactive rules have a value of 0.
userNm | The username of the user who generated the rule.

Rules page rule results (bottom panel)
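As a reading aid, the drill-in columns described above can be modeled as a small record type. This is a hypothetical sketch and not a Collibra DQ API: the class name RuleAuditRow, the describe helper, and the sample timestamp and username are all invented for illustration. Only the field names and the modType and isActive codes come from the table above.

```python
from dataclasses import dataclass

# modType codes as listed in the column table.
MOD_TYPES = {"i": "Insert", "u": "Update", "d": "Delete", "s": "Status change"}


@dataclass
class RuleAuditRow:
    # Field names mirror the drill-in columns; the class itself is hypothetical.
    dataset: str
    ruleNm: str
    ruleValue: str
    modType: str
    updtTs: str
    isActive: int
    userNm: str

    def describe(self) -> str:
        # Decode the single-letter modType and the 1/0 isActive flag.
        action = MOD_TYPES.get(self.modType, "Unknown")
        state = "active" if self.isActive == 1 else "inactive"
        return f"{self.ruleNm} on {self.dataset}: {action}, {state}"


row = RuleAuditRow(
    dataset="shape_example",
    ruleNm="lnametest",
    ruleValue="lname",
    modType="i",
    updtTs="2024-01-01 00:00:00",  # placeholder timestamp
    isActive=1,
    userNm="example_user",  # placeholder username
)
print(row.describe())  # lnametest on shape_example: Insert, active
```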