About data quality monitors

The Monitors tab on the Job Details page allows you to analyze the health of your Data Quality Job. You can view the history of job runs to track the evolution of the data quality score and the number of findings per run. This gives you a clear view of how your job evolves over time, offering data quality insights to support your business decisions.

Data quality monitors are out-of-the-box or user-defined SQL queries that provide observational insights into the quality and reliability of your data. Each monitor is associated with a default data quality dimension. Data quality dimensions categorize data quality findings to help communicate the types of issues detected.

For a job to have monitors, it must contain at least one row of data. If a job runs with zero rows, no monitors are executed, so none appear on the Monitors tab. This is because the first monitor evaluated during a job run is the row count adaptive rule; if no rows are found, the remaining monitors are bypassed because they require rows in the Data Quality Job to function.
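The gating behavior above can be sketched as follows. This is an illustrative model only; the function and monitor names are assumptions, not Collibra's actual implementation.

```python
# Hypothetical sketch: the row count check runs first, and all other
# monitors are bypassed when the job has zero rows.

def run_monitors(rows, monitors):
    """Run the row count rule first; bypass the rest if no rows exist."""
    results = {"Row count": len(rows)}
    if len(rows) == 0:
        # Zero rows: remaining monitors produce no output, so nothing
        # appears on the Monitors tab for this run.
        return results
    for name, check in monitors.items():
        results[name] = check(rows)
    return results
```

For example, a run with zero rows returns only the row count result, while a run with data also executes every other registered monitor.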

The following screenshot and table highlight the various elements of the Monitors tab.

Screenshot of the Monitors tab

Element Description
Run history

Run history shows the data quality score and number of findings from a Data Quality Job run on a given day. You can view these details in a chart or list display.

Tip Hover your pointer over a point on the data quality score line chart to view the exact score and run date, or click the point to open the monitors of any run on its associated date.

Run metadata

Run metadata shows the following details of a Data Quality Job run:

  • The date and time of a Data Quality Job run.
  • The number of active monitors on a Data Quality Job.
  • The number of breaking monitors identified during the job run.
  • The number of columns in the Data Quality Job.
  • The number of rows in the Data Quality Job.
  • The total amount of time the Data Quality Job took to run.
Monitor details table

A table that shows the output of each monitor. It includes details such as the monitor type, the data quality dimension associated with each monitor, and its state.

For more information, go to Monitor details table.

Monitor details table

The following table provides an overview of the columns shown in the monitor details table on the Monitors tab.

Column Description
Monitor name The name of the monitor.
Column name The name of the column that the monitor evaluates.
Monitor type

The type of monitor that actively checks for changes in a Data Quality Job.

  • Data type: Monitors changes to the inferred data type of a column.
  • Empty range: Monitors changes to the number of empty values across all columns.
  • Execution time: Monitors changes to the time it takes for a Data Quality Job to run.
  • Max value: Monitors changes to the highest value in numeric columns.
  • Mean value: Monitors changes to the average value in numeric columns.
  • Min value: Monitors changes to the lowest value in numeric columns.
  • Null check: Monitors changes to the number of NULL values across all columns.
  • Row count: Monitors changes to the number of rows in a Data Quality Job.
  • Schema change: Monitors schema evolution changes, such as columns that are added, altered, or dropped from a Data Quality Job.
  • Uniqueness: Monitors changes to the number of distinct values in fields across all columns.
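Several of the monitor types above track simple per-column statistics between runs. A minimal sketch of those statistics, assuming a column is a list of values (function and key names are illustrative, not Collibra internals):

```python
# Illustrative metrics behind the Null check, Uniqueness, and
# Min/Max/Mean value monitor types described above.

def column_metrics(values):
    """Compute per-column statistics that change-detection monitors track."""
    non_null = [v for v in values if v is not None]
    metrics = {
        "null_count": len(values) - len(non_null),   # Null check
        "distinct_count": len(set(non_null)),        # Uniqueness
    }
    numeric = [v for v in non_null if isinstance(v, (int, float))]
    if numeric:
        metrics["min"] = min(numeric)                  # Min value
        metrics["max"] = max(numeric)                  # Max value
        metrics["mean"] = sum(numeric) / len(numeric)  # Mean value
    return metrics
```

A monitor of these types would compare the current run's metrics against the values observed in previous runs to detect changes.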
Dimensions

The data quality dimension associated with the monitor type.

While you can create custom dimensions, the following list includes the out-of-the-box data quality dimensions.

  • Accuracy: The degree to which data correctly reflects its intended values.
  • Completeness: The degree to which all of the data that could be contained in a dataset is present. Refers to the percentage of columns that have neither EMPTY nor NULL values.
  • Consistency: The degree to which data contains differing, contradicting, or conflicting entries.
  • Integrity: The validity of data across relationships, ensuring that all data in a database can be traced and connected to other data.
  • Validity: The degree to which data conforms to its defining constraints or conditions, which can include data type, range, or format.
  • Duplication: The degree to which each entity is recorded only once. Refers to the cardinality of columns in your dataset.
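The Completeness and Duplication definitions above suggest simple scores. The following is a hedged sketch under those definitions; the formulas and names are assumptions for illustration, not Collibra's scoring logic.

```python
# Illustrative scores for the Completeness and Duplication dimensions
# as defined above.

def completeness(columns):
    """Percentage of columns that contain neither EMPTY ("") nor NULL values."""
    clean = sum(
        1 for values in columns.values()
        if all(v is not None and v != "" for v in values)
    )
    return 100.0 * clean / len(columns)

def duplication(values):
    """Cardinality of a column: ratio of distinct values to total rows."""
    return len(set(values)) / len(values)
```

For instance, a dataset where one of three columns is free of EMPTY and NULL values would score roughly 33% on completeness under this sketch.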
State

When a monitor is active, the following states are possible.

  • Breaking: Indicates that there are data quality findings for the monitor.
  • Passing: Indicates that there aren't any data quality findings for the monitor.
  • Learning: Indicates that the monitor is an adaptive rule that is still evaluating your data to determine whether there are data quality findings.

When a monitor is not active, the Suppressed state is possible.

  • Suppressed: Indicates that a user has manually instructed Collibra to exclude the findings for the monitor from the Data Quality Job results.
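The four states above can be summarized as a simple precedence: suppression overrides everything, learning overrides findings, and otherwise the presence of findings decides between breaking and passing. A minimal sketch (argument names are assumptions, not Collibra's data model):

```python
# Illustrative mapping of a monitor's condition to the states listed above.

def monitor_state(suppressed, learning, finding_count):
    """Return the monitor state implied by the descriptions above."""
    if suppressed:      # User excluded this monitor's findings from results
        return "Suppressed"
    if learning:        # Adaptive rule still evaluating the data
        return "Learning"
    return "Breaking" if finding_count > 0 else "Passing"
```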
Actions

Depending on the state of the monitor, you can:

  • Open the monitor details chart.
  • Pass the finding when it is breaking.
  • Suppress the monitor when it is passing, breaking, or learning.
  • Activate the monitor when it is suppressed.

Important To train monitor results, you need a global role with the Product Rights > Data Quality global permission or the Data Quality Editor or Data Quality Manager resource role with the Data Quality Job > Train Monitors resource permission.

What's next?