About Data Quality & Observability
Important: This feature is available only in the latest UI.
Data Quality & Observability helps ensure that only reliable, high-quality data exists across your data landscape. Automatic and custom data quality monitoring equips you with detailed data profiling insights which, coupled with instant alerting, let you identify and act on data anomalies as soon as they are observed.
Data Quality & Observability provides a comprehensive set of monitoring options for data engineers, analysts, governance roles, and other technical stewards.
Data Quality & Observability process
The following table shows the key processes for maintaining high-quality data with Data Quality & Observability.
Process | Description
---|---
Prepare Edge | Data Quality & Observability prioritizes security by relying on Edge: all interactions with your data sources occur through Edge, so you don't need to change how your databases are exposed, either to Collibra or globally. Before running data quality, a system administrator connects to data sources, adds the required capabilities, and ensures that users have the necessary roles to use Data Quality & Observability.
Test your data at the schema level | Before jumping into advanced monitoring, use quick monitoring for an immediate impression of the health of your data. Quick monitoring creates basic Data Quality Jobs to instantly apply observability to your schema and, based on your preference, to all or some of its tables. Beyond standard data profiling, which includes data type and schema change detection, you can include row count checks and show descriptive statistics, such as minimum and maximum values, in the user interface (UI). While quick monitoring always computes descriptive statistics, Collibra lets you control whether they are visible in the UI, reducing the risk of exposing potentially sensitive data as you begin to check the health of your data.
Create a Data Quality Job at the table level | For in-depth table monitoring, deploy Data Quality Jobs. Building on the initial profiling insights from quick monitoring, table-level Data Quality Jobs provide advanced monitoring capabilities and the flexibility to automate their run schedule. You can configure custom SQL queries to perform targeted checks, offering detailed insights tailored to your business needs. Instant notifications about your Data Quality Job runs keep you informed of important changes in your data quality monitoring, so you can take immediate action and maintain high-quality, trustworthy data. This process lets you proactively identify and address potential issues in your data environment.
View the score | Data quality scores are automatically shown on Column, Table, Schema, and Database asset pages. To see data quality scores on Business and Governance asset pages, a Data Steward can configure custom aggregation paths to those pages, letting you view and monitor the scores as they evolve. In any case, you can always reach related asset pages through direct links on Data Quality Job pages. You can also assign Data Quality Jobs to members of your organization for regular monitoring and closer inspection of potential data quality issues. Assigning responsibilities to a Data Quality Job is a simple way to provide end-to-end governance and strengthen the trustworthiness of your data.
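To make the checks above concrete, here is a minimal illustrative sketch of the kinds of queries involved: a descriptive-statistics profile (akin to quick monitoring) and a custom SQL rule (akin to a table-level Data Quality Job check). This is not Collibra's API; in practice these checks run through Edge against your data source. The `orders` table, its columns, and the 10% null-rate threshold are hypothetical, and an in-memory SQLite database stands in for a real source.

```python
import sqlite3

# Hypothetical sample data; a real check would run against your data source via Edge.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, email TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 19.99, "a@example.com"), (2, 250.0, None), (3, 5.5, "c@example.com")],
)

# Descriptive statistics, comparable to quick monitoring's profiling output.
row_count, min_amount, max_amount = conn.execute(
    "SELECT COUNT(*), MIN(amount), MAX(amount) FROM orders"
).fetchone()
print(f"rows={row_count} min={min_amount} max={max_amount}")

# A custom SQL check, comparable to a targeted Data Quality Job rule:
# flag the table if more than 10% of rows are missing an email address.
null_rate = conn.execute(
    "SELECT AVG(CASE WHEN email IS NULL THEN 1.0 ELSE 0.0 END) FROM orders"
).fetchone()[0]
print("PASS" if null_rate <= 0.10 else f"FAIL: email null rate {null_rate:.0%}")
```

In a monitored deployment, a failing check like the one above would trigger an alert rather than a printed message, so the responsible steward can act on the anomaly immediately.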
Supported data sources
Currently, Data Quality & Observability supports the following data sources:
- Athena
- BigQuery
- Databricks
- Oracle OCI
- Redshift
- SAP HANA
- Snowflake
- SQL Server
- Trino