Working with Dataset Rules

The Dataset Rules page lists all previously saved rules for a given dataset and provides an overview of their details, such as the SQL condition of each rule, its rule type, and whether it passes validation. You can also take several quick actions from this page, such as opening the Rule Workbench or deleting a rule from a dataset. Managing rules from a single page can reduce the time you spend assessing what is needed to meet your business requirements.

Figure: a list of dataset rules

Opening Dataset Rules

The following steps show you how to open the list of applicable rules on your dataset.

  1. Click the Rules wrench icon in the sidebar menu, then click Rule Builder.
  2. The Dataset Rules page opens.
  3. On the Dataset Rules page, enter the name of your dataset in the Search for a Dataset search bar, then select it.
  4. The Dataset Rules page displays any existing rules on your dataset.

Viewing rule details

When one or more rules are available on the selected dataset, the table on the Dataset Rules page contains the following detailed information.

Column Description
Rule Name

The name of the rule.

Hover your cursor over the rule name and click to edit the rule.

Rule Query

The SQL condition of the rule.

Type

The type of rule.

Column

The primary column that the rule queries.

Repo

The data class or template from which the rule is created. This only applies to custom rules, such as Data Type, Data Class, and Template.

Valid

Shows whether the rule passes rule validation.

The valid rule icon shows that the rule passes validation.

The invalid rule icon shows that the rule does not pass validation.

Note If you see the invalid rule icon, you can click it to view more details about why the rule does not pass validation.

Active

Shows whether the rule is active for future runs of the dataset. Click the icon to change the active status of the rule.

The valid rule icon shows that the rule is active.

The inactive rule icon shows that the rule is inactive.

Scoring Type

Shows whether the rule uses absolute- or percentage-based scoring. For more information about scoring types, see the Rule Workbench topic.

Points

The number of points that Collibra DQ deducts from the data quality score when data breaches the conditions of the rule. You can set this value on the Rule Details modal of the Rule Workbench.

If you do not customize this value, then Collibra DQ uses the default value of 1.

%

The ratio of the total number of breaking records over the total number of rows.

If you do not customize this value on the Rule Details modal, then Collibra DQ uses the default value of 1.
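
As a rough illustration of the ratio described for this column, the following Python sketch divides the number of breaking records by the total number of rows. The function name `breaking_ratio` and the sample counts are hypothetical and are not part of Collibra DQ. The sketch only shows the arithmetic, not how Collibra DQ applies the value when scoring.

```python
def breaking_ratio(breaking_records: int, total_rows: int) -> float:
    """Return breaking records divided by total rows (hypothetical helper)."""
    if total_rows <= 0:
        raise ValueError("total_rows must be greater than 0")
    if not 0 <= breaking_records <= total_rows:
        raise ValueError("breaking_records must be between 0 and total_rows")
    return breaking_records / total_rows


# Sample values chosen for illustration only.
ratio = breaking_ratio(25, 1000)
print(f"{ratio:.3f}")  # ratio as a fraction
print(f"{ratio:.1%}")  # same ratio formatted as a percentage
```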

Category

The data category that you optionally define on the Rule Workbench.

Dimension

The DQ Dimension that you optionally assign to the rule on the Rule Workbench.

Note Tagging rules with custom dimensions in the Metastore is not supported.

Timeout Limit

The number of minutes that any active rule can take to process before it times out and Collibra DQ automatically cancels the job. This limit is useful for keeping problematic rules from consuming too many resources.

The maximum timeout limit for a dataset is the greatest Timeout Limit value of any active rule displayed on the table under the Rules tab.

Example If there are 3 active rules associated with a dataset, where Rule 1 has a timeout limit of 20 minutes, Rule 2 has a limit of 30 minutes, and Rule 3 has a limit of 60 minutes, then each rule on the dataset has up to 60 minutes to process. If Rule 3 is inactive, the maximum timeout limit for the dataset is 30 minutes, because Rule 2 then has the highest limit of any active rule.

Tip You can increase this value for an individual rule from Rule Details on the Rule Workbench.
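
The way the maximum timeout limit follows from the active rules can be sketched in Python. This is a minimal illustration of the behavior described for the Timeout Limit column, not Collibra DQ code. The `Rule` class, its field names, and the `max_timeout_limit` function are hypothetical, and returning `None` when no rule is active is an assumption, because the topic does not describe that case.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Rule:
    # Hypothetical stand-in for one row of the Dataset Rules table.
    name: str
    timeout_minutes: int
    active: bool


def max_timeout_limit(rules: Iterable[Rule]) -> Optional[int]:
    """Return the greatest timeout limit among active rules.

    Returns None when no rule is active (an assumption made for this sketch).
    """
    limits = [rule.timeout_minutes for rule in rules if rule.active]
    return max(limits) if limits else None


rules = [
    Rule("Rule 1", 20, True),
    Rule("Rule 2", 30, True),
    Rule("Rule 3", 60, True),
]
print(max_timeout_limit(rules))  # 60, because all three rules are active

rules[2].active = False          # deactivate Rule 3
print(max_timeout_limit(rules))  # 30, because Rule 2 now has the highest limit
```

The two printed values match the 60-minute and 30-minute cases in the example for this column.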

Actions

Click Actions to edit, delete, or rename the rule, or to view its history.

Note Updated rule names cannot match the name of an existing rule and must contain only alphanumeric characters without spaces.
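
If you prepare new rule names outside the application, you can pre-check them against the naming constraints in the note above. The following Python sketch is one way to do that. The function name `is_valid_new_rule_name` is hypothetical, and two details are assumptions not confirmed by this topic: that "alphanumeric" means ASCII letters and digits only, and that names are compared case-sensitively.

```python
import re
from typing import Iterable

# Assumption: "alphanumeric" means ASCII letters and digits only.
_ALPHANUMERIC = re.compile(r"[A-Za-z0-9]+")


def is_valid_new_rule_name(new_name: str, existing_names: Iterable[str]) -> bool:
    """Check a proposed rule name against the constraints in the note.

    The name must contain only alphanumeric characters (no spaces)
    and must not match the name of an existing rule.
    Assumption: the comparison with existing names is case-sensitive.
    """
    if not _ALPHANUMERIC.fullmatch(new_name):
        return False
    return new_name not in set(existing_names)


existing = ["NullCheck", "EmailFormat"]
print(is_valid_new_rule_name("NullCheck2", existing))   # True
print(is_valid_new_rule_name("Null Check", existing))   # False: contains a space
print(is_valid_new_rule_name("EmailFormat", existing))  # False: name already exists
```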

Adding new rules

Click Add Rule to create new rules to run against your dataset.