Working with Dataset Rules
The Dataset Rules page lists all previously saved rules for a given dataset and provides an overview of their details, such as the definitions of SQL conditions, rule types, and whether rules pass validation checks. You can also take several quick actions from this page, such as accessing the Rule Workbench or deleting a rule from a dataset. Managing rules from a single page can reduce the time you spend assessing what is needed to meet your business requirements.
Opening Dataset Rules
The following steps show you how to open the list of applicable rules on your dataset.
- In the sidebar menu, click the icon for the menu that contains Rule Builder, then click Rule Builder. The Dataset Rules page opens.
- On the Dataset Rules page, enter the name of your dataset in the Search for a Dataset search bar, then select it. The Dataset Rules page displays any existing rules on your dataset.
Viewing rule details
When one or more rules are available on the selected dataset, the table on the Dataset Rules page contains the following detailed information.
Column | Description |
---|---|
Rule Name | The name of the rule. Hover your cursor over the rule name and click to edit the rule. |
Rule Query | The SQL condition of the rule. |
Type | The type of rule. |
Column | The primary column that the rule queries. |
Repo | The data class or template from which the rule is created. This only applies to custom rules, such as Data Type, Data Class, and Template. |
Valid | Shows whether the rule passes rule validation. The icon in this column indicates whether the rule passes or does not pass validation. Note: If the rule does not pass validation, you can click the icon to view more details about why. |
Active | Shows whether the rule is active for future runs of the dataset. The icon in this column indicates whether the rule is active or inactive. Click the icon to change the active status of the rule. |
Scoring Type | Shows whether the rule uses absolute- or percentage-based scoring. For more information about scoring types, see the Rule Workbench topic. |
Points | The number of points that Collibra DQ deducts from the data quality score when data breaches the conditions of the rule. You can set this value on the Rule Details modal of the Rule Workbench. If you do not customize this value, then Collibra DQ uses the default value of 1. |
% | The ratio of the total number of breaking records over the total number of rows. If you do not customize this value on the Rule Details modal, then Collibra DQ uses the default value of 1. |
Category | The data category that you optionally define on the Workbench. |
Dimension | The DQ Dimension that you optionally assign to the rule on the Workbench. Note: Tagging rules with custom dimensions in the Metastore is not supported. |
Timeout Limit | The number of minutes that any active rule can take to process before it times out and the tool automatically cancels the job. This limit is useful for keeping problematic rules from consuming too many resources. The maximum timeout limit is determined by the greatest Timeout Limit value of any active rule displayed on the table under the Rules tab. Example: If there are 3 active rules associated with a dataset, where Rule 1 has a timeout limit of 20 minutes, Rule 2 has a limit of 30 minutes, and Rule 3 has a limit of 60 minutes, then the dataset has 60 minutes per rule to process. If Rule 3 is inactive, the maximum timeout limit for the dataset is 30 minutes, because Rule 2 has the next highest limit of any active rule. Tip: You can increase this value for an individual rule from Rule Details on the Rule Workbench. |
Actions | Click Actions to edit, delete, rename, and view the history of the rule. |
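The Timeout Limit example in the table can be worked through as a short calculation. The following is a minimal sketch in Python, not Collibra DQ code: the `Rule` record and the `max_timeout_limit` function are hypothetical names introduced only for illustration, and the sketch assumes that the effective limit is simply the largest Timeout Limit among the active rules, as the table describes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    # Hypothetical record for illustration only; not a Collibra DQ type.
    name: str
    timeout_minutes: int
    active: bool = True


def max_timeout_limit(rules: List[Rule]) -> Optional[int]:
    """Return the greatest timeout among active rules, or None if no rule is active."""
    active_timeouts = [rule.timeout_minutes for rule in rules if rule.active]
    return max(active_timeouts) if active_timeouts else None


# The three rules from the example in the table.
rules = [
    Rule("Rule 1", 20),
    Rule("Rule 2", 30),
    Rule("Rule 3", 60),
]
print(max_timeout_limit(rules))  # 60

# Deactivating Rule 3 leaves Rule 2 with the highest limit of any active rule.
rules[2].active = False
print(max_timeout_limit(rules))  # 30
```

Inactive rules are filtered out before the maximum is taken, which is why deactivating Rule 3 lowers the result from 60 to 30 minutes.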
Adding new rules
Click Add Rule to create new rules to run against your dataset.