About the rule workbench

Custom rules are monitors that consist of user-defined or AI-suggested SQL queries to verify whether your data adheres to the business requirements of your organization. You can use the rule workbench to write, validate, and apply custom SQL rules to existing Data Quality Jobs. This provides valuable insights on the job details page, allowing you to monitor the health of critical data in your organization.

From the rule workbench, you can:

  • Add custom SQL rule monitors to existing Data Quality Jobs with user-defined or AI-suggested SQL queries.
  • Configure automated notifications to send to users when your rule is passing, breaking, or returning an exception.
  • Preview the results of your rule to ensure it will return the expected output before your job runs.
  • Associate data quality dimensions with your rule to help you group and report on the data quality monitoring.
  • Filter the data from your Data Quality Job to include only the rows most relevant to your rule use case.
  • Apply a tolerance setting to define the percentage of rule breaks required to affect rule statuses and trigger notifications tied to the rule.

The following screenshot and table highlight the various elements of the rule workbench.

screenshot of rule workbench

Element number Element Description
Rule name

The unique name of your custom rule.

Note You cannot use a name that is already in use by another rule in the job. Rule names can contain periods (.), underscores (_), and hyphens (-). All other special characters are not allowed.

Primary column

Allows you to set a primary column from your table using a drop-down list. Collibra can then associate your rule with a column asset based on the selected primary column.

Save

Save your rule as a rule monitor for your Data Quality Job. The rule monitor runs automatically the next time your job runs.

SQL

The SQL tab allows you to enter a custom SQL query and, if needed, the filter query.

The SQL query identifies records that violate its conditions, helping you detect unusual, unexpected, or inaccurate data. When a violation, known as a rule break, occurs, the state of the rule is marked as breaking on the Monitors tab of the Job Details page. If no rule breaks occur, the state is marked as passing.

Note When you run a Data Quality Job with rules that include a rule tolerance percentage, only the rules with percentages of breaking records greater than their tolerance thresholds impact the data quality score.

Settings

The settings tab allows you to associate data quality dimensions with your rule, define a rule tolerance percentage, and add notifications and a description.

SQL editor Allows you to edit your custom SQL. When available, it provides typeahead suggestions for elements of a SQL query, such as operators, table names, and column names.
Filter query

A filter query allows you to narrow the scope of your rule by adding a secondary query definition to limit the total number of rows included in the rule.

A filter query uses a simple query format, such as SYMBOL IS NOT NULL or LOW > 25, and must include only conditions and predicates that appear after the WHERE clause in a SQL statement. For example, a correct filter query for results in the "low" column of a NYSE table would be LOW > 25. Full SQL statements, such as SELECT * FROM PUBLIC.NYSE WHERE LOW > 25, are not allowed in the filter query.

Format Automatically formats the SQL query for better readability.
Validate SQL Validates the syntax of your SQL query.
Schema details The schema details tab provides an overview of the columns of your table. It serves as a helpful reference point when adding your SQL rule and helps you determine the type of rules to create.
Results preview After you enter a valid SQL query in the SQL editor and specify the number of sample rows using the run preview button, you can preview the results of your rule on the rule preview tab. This ensures that the rule will return the expected break record output before the job runs.
Run preview button and row sample selector

The run preview button allows you to preview the results of your rule based on the number sample rows you select from the row sample drop-down list.

What's next