Release Notes
- Failure to upgrade to the most recent release of the Collibra Service and/or Software may adversely impact the security, reliability, availability, integrity, performance or support (including Collibra’s ability to meet its service levels) of the Service and/or Software. For more information, read our Collibra supported versions policy.
- Some items included in this release may require an additional cost. Please contact your Collibra representative or Customer Success Manager with any questions.
- 2025.02 (upcoming)
- 2025.01 (upcoming)
- 2024.11 (latest)
- 2024.10
- 2024.09
Release 2025.02
Release Information
- Expected release date of Collibra Data Quality & Observability 2025.02: February 24, 2025
- Release notes publication date: January 20, 2025
Announcement
As a security measure, we are announcing the end of life of the Java 8 and 11 versions of Collibra Data Quality & Observability, effective in the August 2025 (2025.08) release.
In this release (2025.02) and the March (2025.03) release, Collibra Data Quality & Observability is only available on Java 17 and Spark 3.5.3. Depending on your installation of Collibra Data Quality & Observability, you can expect the following in this release:
- Kubernetes installations: Kubernetes containers automatically contain Java 17 and Spark 3.5.3. You may need to update your custom drivers and SAML keystore to maintain compatibility with Java 17.
- Standalone installations: You must upgrade to Java 17 and Spark 3.5.3 to install Collibra Data Quality & Observability 2025.02. We encourage you to migrate to a Kubernetes installation, to improve the scalability and ease of future maintenance.
The April 2025 (2025.04) release will contain Java 8, 11, and 17 versions of Collibra Data Quality & Observability and will be the final release to contain new features on Java 8 and 11. Between 2025.05 and 2025.07, only critical and high-priority bug fixes will be included in the Java 8 and 11 versions of Collibra Data Quality & Observability.
Additional details on driver compatibility, SAML upgrade procedures, and more will be available closer to the release date of the 2025.02 release.
For more information, visit the Collibra Data Quality & Observability Java Upgrade FAQ.
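The timeline in this announcement can be condensed into a small lookup. The following Python sketch is only a reading aid based on the dates stated above; the function name `legacy_java_status` and the wording of the returned strings are hypothetical, and releases before 2025.02 are rejected because this announcement does not describe them.

```python
def legacy_java_status(release: str) -> str:
    """Summarize the Java 8 and 11 status for a release such as '2025.04'.

    The boundaries come from the announcement text; the return strings
    are illustrative wording, not official status names.
    """
    year, month = (int(part) for part in release.split("."))
    key = (year, month)
    if key < (2025, 2):
        # The announcement only covers 2025.02 and later.
        raise ValueError(f"release {release} is not covered by this announcement")
    if key <= (2025, 3):
        return "not available (Java 17 only)"
    if key == (2025, 4):
        return "available, final release with new features"
    if key <= (2025, 7):
        return "critical and high-priority bug fixes only"
    # End of life takes effect in the 2025.08 release.
    return "end of life"


for release in ("2025.02", "2025.04", "2025.06", "2025.08"):
    print(release, "->", legacy_java_status(release))
```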
Enhancements
Platform
- When a non-local user enables assignment alert notifications and enters an email address on their user profile, alert notifications are now sent for findings assigned to that email address.
Connections
- You can now limit the schemas that are displayed in a JDBC connection in Explorer. This enhancement helps you manage usage and maintain security by restricting access to only the necessary schemas. (idea #CDQ-I-152)
Note When you include a restricted schema in the query of a DQ Job, the query scope may be overwritten when the job runs. While only the schemas you selected when you set up the connection are shown in the Explorer menu, users are not restricted from running SQL queries on any schema from the data source.
Jobs
- The Run Date and End Run Date options from the Job tab on the Findings page are now available for Pullup jobs. This allows you to edit run dates from the UI without having to manually edit the command line. The End Run Date option is available only when your Job includes a 24-hour timeslice or a start and end date range timeslice. (idea #CDQ-I-71)
- DQ Jobs on Trino Pushdown connections now allow you to select multiple link ID columns when setting up the dupes monitor.
Rules
- We are delighted to announce that rule filtering is now generally available. To use rule filtering, an admin needs to set RULE_FILTERING to TRUE in the Application Configuration.
- We are also pleased to announce that rule tolerance is generally available.
- You can now define a DQ Dimension when you create a new template or edit an existing one to be applied to all new custom rules created using this template. Additionally, a Dimension column is now shown on the Templates page.
- There is now a dedicated break_msg column in the rule_output Metastore table, which shows the break message when a rule break occurs.
Note If you previously had RULE_FILTER set to TRUE in the Application Configuration during the beta, you need to set it back to TRUE after installing Collibra Data Quality & Observability 2025.02.
Dataset Manager
- You can now add meta tags up to 100 characters long from the Edit option under the Actions menu on the Dataset Manager. (idea #CDQ-I-74)
Fixes
Platform
- We have improved the security of our application.
Jobs
- When a scheduled job runs, only the time it is scheduled to run displays in the Scheduled Time column on the Jobs page.
- When you edit and re-run a DQ Job with a source-to-target mapping of data from two different data sources, the -srcds parameter on the command line now correctly contains the src_ prefix before the source dataset, and the DQ Job re-runs successfully.
- When you include a custom column via the source query of a source-to-target mapping, the validate overlay function no longer fails with an error message.
- When you clone a dataset from Dataset Manager, Explorer, or the Job tab on Findings, the Job Name field and the Command Line in the Review step of Explorer now correctly reflect the "temp_cloned_dataset" naming convention for cloned datasets.
- DQ Jobs with the same run date can no longer run concurrently.
Rules
- When you use Freeform SQL queries with LIKE % conditions, for example, SELECT * FROM @public.test where testtext LIKE '%a, c%', they now return the expected results.
- SQL Server job queries where the table name is escaped with brackets, for example, select * from dbo.[Table], now process correctly when the job runs.
- The Rule Results Preview button is no longer disabled when the API call to gather the Livy status fails due to an invalid or non-existent session. The API call now correctly manages the cache for Livy sessions terminated due to idle timeout.
- The contents of the Export Rules with Details now include data from the new Tolerance rule setting.
- Data type rules now evaluate the contents of each column, not just the data type of each column, to ensure the correct breaks are returned.
- SQL reserved keywords included in string parsing are now correctly preserved in their original case.
- When you update the name of a rule and an alert is configured on it, the alert will now show the updated name when sent.
Alerts
- When you set a breaking rule as passing, a rule status alert for rules with breaking statuses no longer sends erroneous error messages for that rule.
Connections
- When you substitute a PWD value as a sensitive variable on the Variables tab of a Databricks connection template, the sensitive variable in the connection URL is now set correctly for source-to-target mappings where Databricks is the source dataset.
APIs
- The /v3/jobs/<jobId>/breaks/rules endpoint no longer returns a 500 error when using a valid jobId. Instead, it now returns empty files when no results are found for exports without findings.
- When you schedule a Job to run monthly or quarterly, the jobSchedule object in the /v3/datasetDefs endpoint now reflects your selection.
- A 404 error response no longer returns when Archive Rules Break Records is enabled and you run the /v3/rules/{dataset}/{ruleName}/{runId}/breaks endpoint.
- Meta tags are now correctly applied to new datasets that are created with the /v3/datasetDefs endpoint.
- When an integration import job fails, the /v3/dgc/integrations/jobs endpoint now returns the correct “failed” status. Additionally, the integration job status “ignored” is now available.
Integration
- An invalid RunID no longer returns a successful response when using a Pushdown dataset with the /v3/jobs/run endpoint.
Release 2025.01
Release Information
- Expected release date of Collibra Data Quality & Observability 2025.01: January 27, 2025
- Release notes publication date: December 31, 2024
Announcement
As a security measure, we are announcing the end of life of the Java 8 and 11 versions of Collibra Data Quality & Observability, effective in the August 2025 (2025.08) release.
In the February 2025 (2025.02) release, Collibra Data Quality & Observability will only be available on Java 17 and Spark 3.5.3. Depending on your installation of Collibra Data Quality & Observability, you can expect the following in the 2025.02 release:
- Kubernetes installations: Kubernetes containers will automatically contain Java 17 and Spark 3.5.3. You may need to update your custom drivers and SAML keystore to maintain compatibility with Java 17.
- Standalone installations: You must upgrade to Java 17 and Spark 3.5.3 to install Collibra Data Quality & Observability 2025.02. Additional upgrade guidance will be provided upon the release date. We encourage you to migrate to a Kubernetes installation, to improve the scalability and ease of future maintenance.
The April 2025 (2025.04) release will have Java 8, 11, and 17 versions of Collibra Data Quality & Observability and will be the final release to contain new features on Java 8 and 11. Between 2025.05 and 2025.07, only critical and high-priority bug fixes will be included in the Java 8 and 11 versions of Collibra Data Quality & Observability.
Additional details on driver compatibility, SAML upgrade procedures, and more will be available alongside the 2025.02 release.
For more information, visit the Collibra Data Quality & Observability Java Upgrade FAQ.
Enhancements
Platform
- When you access Swagger from the Admin Console or from the icon in the upper right corner of any Collibra Data Quality & Observability page, it now opens in a separate tab or window.
Connections
- You can now authenticate Amazon S3 connections using OAuth2 with an Okta principal as a service account. This enhancement simplifies the authentication process and improves security.
- You can now use commas and equals characters in values in the Values option on the Variables tab. For example, val1=,val2 is now a supported variable value.
- Additionally, the Driver Properties on the Properties tab now use a JSON array format instead of a comma-separated string. For example, [{"name":"prop1","value":"val1"},{"name":"prop2","value":"val2"}] is now the supported format for properties.
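As an illustration of the Driver Properties format change, the following Python sketch converts a legacy comma-separated property string into the JSON array shape shown above. The function name `to_property_array` is hypothetical, and the assumption that the legacy string consisted of name=value pairs separated by commas is not stated in these notes, so adjust it to match your existing configuration.

```python
import json


def to_property_array(legacy: str) -> str:
    """Convert 'prop1=val1,prop2=val2' into the JSON array format.

    Assumes (not confirmed by the release notes) that the legacy string
    is a comma-separated list of name=value pairs.
    """
    properties = []
    for pair in legacy.split(","):
        pair = pair.strip()
        if not pair:
            continue  # skip empty segments such as a trailing comma
        # Split on the first '=' only, so a value may itself contain '='.
        name, _, value = pair.partition("=")
        properties.append({"name": name.strip(), "value": value.strip()})
    # Compact separators reproduce the spacing used in the example above.
    return json.dumps(properties, separators=(",", ":"))


print(to_property_array("prop1=val1,prop2=val2"))
# prints [{"name":"prop1","value":"val1"},{"name":"prop2","value":"val2"}]
```

Because the sketch splits on every comma, it cannot handle legacy values that themselves contain commas; such values would need to be converted by hand.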
Jobs
- You can now use multiple link ID columns to identify duplicate records in Pullup DQ Jobs.
- Snowflake Pushdown connections now support source-to-target validation checks for datasets from the same Snowflake Pushdown connection.
- Snowflake Pushdown connections now support all regex special characters in column names except for square brackets [ ].
Rules
- When you hover over the break percentage (Perc) value on the Rules tab of the Findings page, a new popover message now shows the full value. This update improves visibility and makes it easier to view detailed information.
- Tooltips are now available for the Points and Perc columns on the Rules tab of the Findings page.
- You now see a descriptive error message if your data class rule fails to save because of missing required fields or a duplicate name when using the data class rule creator.
- You no longer need to rerun a DQ Job after renaming a rule for its new name to be reflected across all relevant pages.
APIs
- The POST /v3/datasetdefs endpoint now supports creating outlier and/or pattern records directly by including the outliers or patterns element in the JSON payload (dataset name is required).
- If an ID is not provided, a new record is created.
- If an ID is provided and matches an existing record, the existing record will be updated.
- Alternatively, you can use the existing /v2/upsertPatternOpt and /v2/upsertOutlierOpt endpoints to create records and then reference their IDs when you use the PUT /v3/datasetdefs endpoints for updates.
- The ControllerAdmin API now requires ROLE_ADMIN or ROLE_ADMIN_VIEWER to use the endpoints:
- /gettotalcost
- /gettotalbytes
- The ControllerSecurity API now requires ROLE_ADMIN, ROLE_USER_MANAGER, or ROLE_OWL_ROLE_MANAGER to use the endpoints:
- /getalldbuserdetails
- /getdbuserwithroles
- getAllAdGroupsandRoles
- The ControllerUsageStats API now requires ROLE_ADMIN or ROLE_ADMIN_VIEWER to use the endpoints:
- /getuserprofilecount
- /getcolumnscount
- /getusercount
- /getdatasetcount
- /getalertcount
- /getemailcount
- /getfailingdataset
- /getpassingdataset
- /getowlcheckcount
- /getowlusagelast30days
- /getowlusagemonthly
- /getrulecount
- /gettotalowlchecks
- We deprecated the GET /deleteappconfig. As an alternative, you can use DELETE /deleteappconfig instead.
- We deprecated the GET /deleteadminconfig. As an alternative, you can use DELETE /deleteadminconfig instead.
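The create-or-update behavior described above for the outliers and patterns elements of POST /v3/datasetdefs can be pictured with a small in-memory model. This is only a sketch of the documented semantics, not the server implementation: the `upsert` function, the `store` dictionary, and the `column` field are hypothetical. The notes do not say what happens when an ID is provided but matches no existing record, so the sketch raises an error in that case.

```python
def upsert(store: dict, record: dict) -> dict:
    """Create the record when it has no ID, update it when the ID matches."""
    record_id = record.get("id")
    if record_id is None:
        # No ID provided: create a new record with the next free ID.
        new_id = max(store, default=0) + 1
        store[new_id] = {**record, "id": new_id}
        return store[new_id]
    if record_id in store:
        # ID matches an existing record: update it in place.
        store[record_id].update(record)
        return store[record_id]
    # Behavior for an unknown ID is not documented; fail loudly here.
    raise KeyError(f"no record with id {record_id}")


store = {}
created = upsert(store, {"column": "price"})
updated = upsert(store, {"id": created["id"], "column": "amount"})
print(len(store), updated)
```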
Fixes
Jobs
- You can again preview columns on SAP HANA tables with periods (.) in their names.
- When you run DQ Jobs from the command line, the command -srcinferschemaoff, which prevents Collibra Data Quality & Observability from inferring the schema in the validate source activity, now works as expected when files are in the source.
- When DB_VIEWS_ON is set to TRUE on the Application Configuration page of the Admin Console, the Include Views option is now also applied when using the Mapping step for a Database connection.
Findings
- The data preview of DQ Jobs created on file data sources that use pipe delimiters (|) now displays correctly on the Profile and Shapes tab of the Findings page.
- DQ Jobs with the validate source layer enabled no longer return an error when you open their Findings pages.
- When you edit the scheduled run time of a DQ Job from the Findings page, the updated scheduled time now saves correctly.
- We removed the sort icon for the Breaking Records and Passing Records columns on the Rules tab of the Findings page, because it was not functional.
- When you click the Run ID link of a dataset on the Assignments Queue, the Findings page now opens to the finding type associated with the Run ID. For example, a Rule finding links to the Rules tab on the Findings page.
Profile
- When you change an Adaptive Rule from automatic to manual scoring on the Profile page, the points value now saves correctly.
- When you manually set the lower and upper bounds of an Adaptive Rule, you can now use decimal values, such as 10.8.
- When you use the search function within the Profile page to search for a partial string of a column name, such as "an" to return results for the columns "exchange" and "company," the correct number of results are now returned. Previously, a maximum of 10 results were returned.
- When a column contains special characters, the column stats section on the Profile page now formats the results correctly, ensuring better readability.
Rules
- When you use the Archive Break Records feature and the link ID column is an alias column, Collibra Data Quality & Observability now properly identifies the link ID column to ensure the rule processes correctly.
- Complex DQ Job query joins containing SAP HANA reserved words in SQL queries are now supported. This ensures that the query compilation processes successfully. The following SAP HANA reserved words are supported in SQL queries:
"union", "all", "case", "when", "then", "else", "end", "in", "current_date", "current_timestamp", "current_time", "weekday", "add_days", "add_months", "add_years", "days_between", "to_timestamp", "quarter", "year", "month", "to_char"
- User-defined rules against DQ Jobs on Snowflake Pushdown connections no longer return a syntax error when "@" is present anywhere in the query.
- A rule on a DQ Job with Archive Breaking Records enabled now processes successfully without returning an exception message, even when it does not reference all link IDs in its SELECT statement.
- We temporarily removed the ability to view rule results in fullscreen on the Findings page to address an issue where rule results were missing in fullscreen. The fullscreen view will be re-added in the upcoming Collibra Data Quality & Observability 2025.03 release.
- Int Check rules using string column types no longer flag valid integers as breaking.
Note When displaying preview break records, if a column in the dataset has an inferred data type that differs from the defined type, the data type check is performed based on the defined type rather than the inferred type.
- Break records are no longer shown for rules with an exception status. Additionally, an error message now provides details about the exception.
Dataset Manager
- When you rename a dataset from the Dataset Manager, you can now use a name that exists in a different tenant but not in your current one.
- Datasets with spaces in their names no longer lose their rules when you rename them from the Dataset Manager.
Reports
- The time zone configurations of certain Collibra Data Quality & Observability environments no longer prevent you from properly viewing the Dataset Findings Report.
- When you use the search filter on the DQ Check Summary Report, you no longer encounter an error when you reach records for Check Type - Rule.
Connections
- The CDATA driver is now supported for Athena Pushdown connections.
Admin Console
- The Security Audit Trail now correctly shows the username of the user who deleted a dataset.
APIs
- The /v3/jobs/<jobId>/breaks/rules endpoint no longer returns a 500 error when using a valid jobId. Instead, it now returns empty files when no results are found for exports without findings.
- The /v3/jobs/{jobID}/findings and /v3/jobs/{dataset}/{runDate}/findings endpoints now return the correct value for the passFail parameter.
Integration
- When you integrate a dataset into Collibra Data Intelligence Platform, then add extra columns to it and re-run the integration, the exported JSON file now correctly shows the additional columns.
- The score tile on the Quality tab of Data Quality Rule and Data Quality Job assets now works as expected.
- Dates on the Overview - History table and chart on the Quality tab are now arranged in chronological order.
- The tooltip for Last Run Status on the Summary tab of Data Quality Job assets now contains the correct message.
Beta features
Rules
- When you edit an existing rule from the Rules tab of the Findings page, you can now edit the tolerance value.
- Rule Tolerance is now available for Pushdown connections.
- Rule filter queries applied to rules on Pullup DQ Jobs now support secondary datasets. Filter queries on @t1 rules are not supported at this time.
Release 2024.11
Release Information
- Release dates of Collibra Data Quality & Observability:
- November 25, 2024: Collibra Data Quality & Observability 2024.11
- December 11, 2024: Collibra Data Quality & Observability 2024.11.1
- Release notes publication date: October 31, 2024
Announcement
As a security measure, we are announcing the end of life of the Java 8 and 11 versions of Collibra Data Quality & Observability, effective in the August 2025 (2025.08) release.
In the February 2025 (2025.02) release, Collibra Data Quality & Observability will only be available on Java 17. Depending on your installation of Collibra Data Quality & Observability, you can expect the following in the 2025.02 release:
- Kubernetes installations: Kubernetes containers will automatically contain Java 17. You may need to update your custom drivers and SAML keystore to maintain compatibility with Java 17.
- Standalone installations: You must upgrade to Java 17 to install Collibra Data Quality & Observability 2025.02. Additional upgrade guidance will be provided upon the release date. We encourage you to migrate to a Kubernetes installation, to improve the scalability and ease of future maintenance.
The March 2025 (2025.03) release will have Java 8 and 11 versions of Collibra Data Quality & Observability and will be the final release to contain new features on those Java versions. Between 2025.04 and 2025.07, only critical and high-priority bug fixes will be included in the Java 8 and 11 versions of Collibra Data Quality & Observability.
Additional details on driver compatibility, SAML upgrade procedures, and more will be available alongside the 2025.02 release.
For more information, visit the Collibra Data Quality & Observability Java Upgrade FAQ.
Enhancements
Platform
- When querying the rule_output table in the Metastore, the rows_breaking and total_count columns are now populated with the correct values for each assignment_id. When a rule filter is used, the total_count column reflects the filtered number of total rows.
Integration
- We automated connection mapping by introducing:
- Automapping of schemas, tables, and columns.
- The ability to view table statistics for troubleshooting unmapped or partially mapped connections.
- The Quality tab is now hidden when an aggregation path is not available. (idea #DCC-I-3252)
Pushdown
- Snowflake Pushdown connections now support source to target analysis for datasets from the same Snowflake Pushdown connection.
- You can now monitor advanced data quality layers for SAP HANA Pushdown connections, including categorical and numerical outliers and records.
- Trino Pushdown connections now support multiple link IDs for dupes scans.
Connections
- We now provide out-of-the-box support for Cassandra and Denodo data source connections. You can authenticate both connection types with the basic username and password combination and password manager method.
- You can now authenticate SQL Server connections with NTLM.
- We upgraded the Snowflake JDBC driver to 3.20.0.
Jobs
- You can now set a new variable, -rdAdj, in the command line to dynamically calculate and substitute the run date for the -rd variable at the run time of your DQ Job.
- The metadata bar now displays the schema and table name.
Findings
- If you assign multiple Link IDs in a Dupes configuration, each Link ID is now present in the break record preview.
- When there are rule findings, the Breaking Records column on the Rules tab displays the number of rows that do not pass the conditions of a rule. In the Metastore, the values from the Breaking Records column are included in the rows_breaking column of the rule_output table. However, after initially upgrading to 2024.11, values in the rows_breaking column remain [NULL] until you re-run your DQ Job.
Important To include data from the rows_breaking column in a dashboard or report, you first need to re-run your DQ Job to populate the column with data.
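To make the note above concrete, here is a hedged sketch of a report query that skips rows whose rows_breaking value has not been populated yet. It uses Python's built-in sqlite3 module with an in-memory table as a stand-in for the Metastore. Only the table name rule_output and the columns rows_breaking and total_count come from these notes; the `rule_nm` column, the `breaking_percentages` function, and the sample values are invented for illustration.

```python
import sqlite3

# In-memory stand-in for the Metastore; the real table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rule_output ("
    "rule_nm TEXT, rows_breaking INTEGER, total_count INTEGER)"
)
conn.executemany(
    "INSERT INTO rule_output VALUES (?, ?, ?)",
    [
        ("not_null_check", None, None),  # not re-run since the upgrade
        ("range_check", 5, 100),         # re-run after the upgrade
    ],
)


def breaking_percentages(connection):
    """Return (rule, percent breaking) for rows that have been populated."""
    return connection.execute(
        "SELECT rule_nm, 100.0 * rows_breaking / total_count "
        "FROM rule_output "
        "WHERE rows_breaking IS NOT NULL AND total_count > 0 "
        "ORDER BY rule_nm"
    ).fetchall()


print(breaking_percentages(conn))
```

Filtering on IS NOT NULL keeps unpopulated rows out of the report instead of showing them as zero breaks, which would be misleading for DQ Jobs that have not been re-run.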
Alerts
- There are now 8 new variables that allow you to create condition alerts for the scores of findings that meet their criteria. These condition variables include:
- behaviorscore
- outlierscore
- patternscore
- sourcescore
- recordscore
- schemascore
- dupescore
- shapescore
- Job failure alerts now send when a DQ Job fails in the Staged or Initiation activities.
Example To create an alert for shapes scores above 25, you can set the condition to shapescore > 25.
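The following Python sketch illustrates how a condition such as shapescore > 25 relates to the score variables listed above. It is not the product's alert evaluator: the `condition_met` function and the simple "variable operator number" syntax it accepts are assumptions made for this example.

```python
import operator

# The eight score variables named in these release notes.
SCORE_VARIABLES = {
    "behaviorscore", "outlierscore", "patternscore", "sourcescore",
    "recordscore", "schemascore", "dupescore", "shapescore",
}

# Comparison operators supported by this sketch (an assumption).
OPERATORS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
}


def condition_met(condition: str, scores: dict) -> bool:
    """Evaluate a condition of the form '<variable> <operator> <number>'."""
    parts = condition.split()
    if len(parts) != 3:
        raise ValueError(f"unsupported condition: {condition!r}")
    variable, symbol, threshold = parts
    if variable not in SCORE_VARIABLES:
        raise ValueError(f"unknown score variable: {variable}")
    if symbol not in OPERATORS:
        raise ValueError(f"unsupported operator: {symbol}")
    return OPERATORS[symbol](scores[variable], float(threshold))


print(condition_met("shapescore > 25", {"shapescore": 30}))  # prints True
```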
Dataset Manager
- You can now edit and clone DQ Jobs from the Actions button in the far right column on the Dataset Manager.
Fixes
Integration
- Data Quality Job assets now display a “No data quality score available” message when an invalid rule is selected.
- When Collibra Data Quality & Observability cannot retrieve the columns from a table or view during the column mapping process, the column UUIDs in Collibra Data Intelligence Platform are now used by default.
Pushdown
- You can now run Pushdown Jobs using OAuth Tokens generated by the /v3/auth/Oauth/signin endpoint.
- Unique adaptive rules for Pushdown Jobs with columns that contain null values no longer fail when a scheduled run occurs.
- When turning behavioral scoring off in the JSON definition of a DQ Job created on a Pushdown connection, behavior scores are no longer displayed.
- When DQ Jobs created on Pushdown connections with Archive Break Records enabled run, references to link IDs in the rule query are now checked and added automatically if they are missing. This also allows you to add your own CONCAT() when using complex rules.
- We improved the performance of DQ Jobs created on Snowflake Pushdown connections that use LIMIT 1 for data type queries.
Connections
- We fixed a critical issue that prevented DQ Jobs on temp files from running because of a missing temp file bucket error.
Jobs
- Backrun DQ Jobs are now included in the Stage 3 Job Logs.
- Data Preview now works correctly when the source in the Mapping (source to target) activity is a remote file storage connection, such as Amazon S3.
- DQ Jobs on Oracle datasets now run without errors when Parallel JDBC is enabled.
- When using Dataset Overview to query an Oracle dataset, you no longer receive a generic "Error occurred. Please try again." error message when the source data contains a column with a "TIMESTAMP" data type.
- When including any combination of the conformity options (Min, Mean, or Max) from the Adaptive Rules tab, the column of reference on the Shapes tab is no longer incorrectly marked “N/A” instead of “Auto.”
- Shapes can now be detected after enabling additional Adaptive Rules beyond the default Adaptive Rules settings for file-based DQ Jobs.
- After setting up a source to target mapping in the Mapping step of Explorer where both source and target are temp files, you no longer encounter a “Leave this Mapping” message when you click one of the arrows on the right side of the page to proceed to the next step.
Findings
- After suppressing a behavior score for a dataset that you then use to create a scorecard, the scorecard and Findings page now reflect the same score.
- After suppressing a behavior score and the total score is over 100, the new score is calculated correctly.
Rules
- Rules with special characters in the link ID column now load successfully in the Rule Breaks preview.
- When changing a rule type from a non-Native to Native rule, the Livy banner no longer displays and the Run Result Preview button is enabled. When changing any rule type to any other rule type that is non-Native, Livy checks run and the appropriate banner displays or the Run Result Preview button is enabled.
Alerts
- When a single rule is passing after adding 3 distinct alerts for each Rule Status trigger (Breaking, Exception, and Passing) and one alert with all 3, unexpected alerts no longer send when the DQ Job runs.
- Batch alerts now use the same alerts queue to process as all other alert emails.
APIs
- The /v2/getdatapreview API is now crossed out and marked as deprecated in Swagger. While this API is now deprecated, it continues to function to allow backward compatibility and functionality in legacy workflows.
- The Swagger UI response array now includes the 204 status code, which means that a request has been successfully completed, but no response payload body will be present.
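A client that calls these endpoints may want to treat 204 separately from other success codes, since there is no body to parse. The sketch below shows one way to do that in Python; `parse_response` is a hypothetical helper that takes a status code and a raw body rather than making a real request, and the choice to raise on non-2xx codes is an assumption of this example.

```python
import json
from typing import Any, Optional


def parse_response(status: int, body: bytes) -> Optional[Any]:
    """Decode a JSON response body, returning None for 204 No Content."""
    if status == 204:
        # Successful request, but there is no payload to decode.
        return None
    if 200 <= status < 300:
        return json.loads(body)
    raise RuntimeError(f"request failed with status {status}")


print(parse_response(204, b""))              # prints None
print(parse_response(200, b'{"ok": true}'))  # prints {'ok': True}
```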
Latest UI
- When using Dataset Overview to query an Oracle dataset, you no longer receive a generic "Error occurred. Please try again." error message when the source data contains a column with a "TIMESTAMP" data type.
- The Adaptive Rules modal on the Findings page now allows you to filter the results to display only Adaptive or Manual Rules or display both.
- We re-added the ability to expand the results portion of the Findings page to full screen.
- There is now an enhanced warning message when you create an invalid Distribution Rule from the Profile page.
- The Select Rows step of Explorer now has a tooltip next to the Standard View option to explain why it is not always available.
- The Actions button on the Dataset Manager now includes options to edit and clone DQ Jobs.
- The Rule Details dialog now has a tooltip next to the "Score" buttons to explain the downscoring options.
- We consolidated the individual login buttons on the Tenant Manager page to a single button that returns you to the main login page.
- Table headers in exported Job Logs generated from the Jobs page now display correctly.
Beta features
Rules
- You can now apply a rule tolerance value to indicate the threshold above which your rule breaks require the most urgent attention. Because alerts associated with rules can generate many alert notifications, this helps to declutter your inbox and allows you to focus on the rule breaks that matter most to you.
- Rule filtering is now available for Pushdown DQ Jobs.
Maintenance Updates
- We added a new check on the flyway library to resolve issues upon upgrade to Collibra Data Quality & Observability 2024.10.
- Denodo connections now support OAuth2 authentication.
- You can now configure AWS passwordless authentication using Amazon RDS PostgreSQL as the Metastore.
Note AWS passwordless authentication is currently only supported for EC2 Instance Profile-based authentication with an Amazon RDS Metastore for Collibra Data Quality & Observability standalone and cluster-based deployments. IAM pod role-based authentication support will be available in a future release.
Release 2024.10
Release Information
- Release date of Collibra Data Quality & Observability 2024.10: October 29, 2024
- Release notes publication date: September 23, 2024
Warning
Some customers have encountered issues while upgrading to Collibra Data Quality & Observability 2024.10 due to a change in our flyway library that is not backwards compatible. The fix for this issue is included in the Collibra Data Quality & Observability 2024.11.1 patch. As always, we recommend backing up and restoring your Metastore before upgrading Collibra Data Quality & Observability versions.
Note The above issue only impacts upgrades to Collibra Data Quality & Observability 2024.10. New installations will not encounter this issue.
Enhancements
Pushdown
- SAP HANA Pushdown is now generally available.
- When creating and running DQ Jobs on SQL Server Pushdown connections, you can now perform schema, profile, and rules checks.
- You can now scan for fuzzy match duplicates in DQ Jobs created on BigQuery Pushdown connections.
- You can now scan for numerical outliers in DQ Jobs created on Trino Pushdown connections.
- DQ Jobs created on Snowflake Pushdown connections now support union lookback for advanced outlier configurations.
- DQ Jobs created on Snowflake Pushdown connections now support source to target validation to ensure data moves consistently through your data pipeline and identify changes when they occur.
Integration
- You can now define custom integration data quality rules, which are also known as aggregation paths, in the Collibra Data Intelligence Platform operating model setting to allow you to view data quality scores for assets other than databases, schemas, tables, and columns.
- To allow you to manage the scope of the DGC resources to which OAuth can grant access during the integration, a new OAuth parameter in the Web ConfigMap is now set to DQ_DGC_OAUTH_DEFAULT_REQUEST_SCOPE: "dgc.global-manage-all-resources" by default. This configuration grants Collibra Data Quality & Observability access via OAuth to all DGC resources during the integration. For more granular control over the DGC resources to which Collibra Data Quality & Observability is granted access via OAuth, we plan to introduce additional allowable values in a future release.
- Users of the Quality tab in Collibra Data Intelligence Platform who do not have a Collibra Data Quality & Observability account can now view the history table to track the evolution of the quality score of a given asset.
- When using Replay to run DQ Jobs over a defined historical period, for example 5 days in the past, the metrics from each backrun DQ Job are included in the DQ History table and the quality calculation.
- After integrating a dataset between Collibra Data Quality & Observability and Collibra Data Intelligence Platform, you can now see the number of passing and failing rows for a given rule on the Data Quality Rule asset page.
- The JSON of a dataset integration between Collibra Data Quality & Observability and Collibra Data Intelligence Platform now shows the number of passing and breaking records.
Important You need a minimum of Collibra Data Intelligence Platform 2024.10 and Collibra Data Quality & Observability 2024.07.
Jobs
- You can now edit the schedule details of DQ Jobs from the Jobs Schedule page.
- A banner now appears when a data type is not supported.
Rules
- Data Class Rules now have a maximum length of 64 characters for the Name option and 256 characters for the Description option.
- Email Data Classes where the email contains a single character domain name now pass validation. For example, [email protected]
- The Rule Definitions page now has a Break Record Preview Available column to make it easier to see when a rule is eligible for previewing break records.
- You can now use the search field to search for Rule Descriptions on the Rule Definitions page.
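The single-character domain case for Email Data Classes noted above can be pictured with a small sketch. The pattern and the address below are made up for illustration and are not the validation logic Collibra Data Quality & Observability uses. The sketch only contrasts a pattern that requires at least two characters in the domain label with one that accepts a single character.

```python
import re

# Hypothetical patterns for illustration only; not Collibra's validation rule.
# STRICT requires two or more characters in the first domain label.
STRICT = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]{2,}(\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}")
# RELAXED accepts a first domain label of one or more characters.
RELAXED = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}")


def is_email(value: str, pattern: re.Pattern = RELAXED) -> bool:
    """Return True when the whole value matches the given email pattern."""
    return pattern.fullmatch(value) is not None


sample = "user@x.org"  # made-up address with a single-character domain name
print(is_email(sample, STRICT))   # rejected by the stricter pattern
print(is_email(sample, RELAXED))  # accepted by the relaxed pattern
```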
Alerts
- You can now toggle individual alerts from the Active column on the Alert Builder page to control when alerts are sent. This can prevent unnecessary alerts from being sent on certain occasions, such as during setup and debugging.
Dataset Manager
- The Dataset Manager table now contains a searchable and sortable column called Connection Name to help identify your datasets more easily.
- We aligned the roles and permissions requirements for the Dataset Manager API.
- PUT /v2/updatecatalogobj requires its users to have ROLE_ADMIN, ROLE_DATA_GOVERNANCE_MANAGER, or ROLE_DATASET_ACTIONS.
- PUT /v2/updatecatalog requires its users to be the dataset owner or have ROLE_ADMIN or ROLE_DATASET_ACTIONS.
- DELETE /v2/deletedataset requires its users to be the dataset owner or have ROLE_ADMIN or ROLE_DATASET_MANAGER.
- PATCH /v2/renameDataset requires its users to be the owner of the source dataset or have ROLE_ADMIN, ROLE_DATASET_MANAGER, or ROLE_DATASET_ACTIONS.
- POST /v2/update-run-mode requires its users to have ROLE_DATASET_TRAIN, ROLE_DATASET_ACTIONS, or dataset access.
- POST /v2/update-catalog-data-category requires its users to have ROLE_DATASET_TRAIN, ROLE_DATASET_ACTIONS, or dataset access.
- PUT /v2/business-unit-to-dataset requires its users to have ROLE_DATASET_ACTIONS or dataset access.
- POST /v2/business-unit-to-dataset requires its users to have ROLE_DATASET_ACTIONS or dataset access.
- POST /dgc/integrations/trigger/integration requires its users to have ROLE_ADMIN, ROLE_DATASET_MANAGER, or ROLE_DATASET_ACTIONS.
- POST /v2/postjobschedule requires its users to have ROLE_OWL_CHECK and either ROLE_DATASET_ACTIONS or dataset access.
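The role requirements listed above can be summarized as a lookup table. The following is a minimal sketch that transcribes the list into Python. The Requirement class and the req and is_allowed helpers are hypothetical names used only for illustration and are not part of the Collibra Data Quality & Observability API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    any_of: frozenset = frozenset()  # holding any one of these roles is enough
    all_of: frozenset = frozenset()  # every one of these roles is mandatory
    owner_ok: bool = False           # the dataset owner also qualifies
    access_ok: bool = False          # a user with dataset access also qualifies


def req(*any_of, all_of=(), owner_ok=False, access_ok=False):
    """Hypothetical helper that builds a Requirement from plain role names."""
    return Requirement(frozenset(any_of), frozenset(all_of), owner_ok, access_ok)


# Transcribed from the list above.
REQUIREMENTS = {
    ("PUT", "/v2/updatecatalogobj"): req("ROLE_ADMIN", "ROLE_DATA_GOVERNANCE_MANAGER", "ROLE_DATASET_ACTIONS"),
    ("PUT", "/v2/updatecatalog"): req("ROLE_ADMIN", "ROLE_DATASET_ACTIONS", owner_ok=True),
    ("DELETE", "/v2/deletedataset"): req("ROLE_ADMIN", "ROLE_DATASET_MANAGER", owner_ok=True),
    ("PATCH", "/v2/renameDataset"): req("ROLE_ADMIN", "ROLE_DATASET_MANAGER", "ROLE_DATASET_ACTIONS", owner_ok=True),
    ("POST", "/v2/update-run-mode"): req("ROLE_DATASET_TRAIN", "ROLE_DATASET_ACTIONS", access_ok=True),
    ("POST", "/v2/update-catalog-data-category"): req("ROLE_DATASET_TRAIN", "ROLE_DATASET_ACTIONS", access_ok=True),
    ("PUT", "/v2/business-unit-to-dataset"): req("ROLE_DATASET_ACTIONS", access_ok=True),
    ("POST", "/v2/business-unit-to-dataset"): req("ROLE_DATASET_ACTIONS", access_ok=True),
    ("POST", "/dgc/integrations/trigger/integration"): req("ROLE_ADMIN", "ROLE_DATASET_MANAGER", "ROLE_DATASET_ACTIONS"),
    # ROLE_OWL_CHECK is mandatory, plus ROLE_DATASET_ACTIONS or dataset access.
    ("POST", "/v2/postjobschedule"): req("ROLE_DATASET_ACTIONS", all_of=("ROLE_OWL_CHECK",), access_ok=True),
}


def is_allowed(method, path, roles, is_owner=False, has_dataset_access=False):
    """Check a user's roles against the requirement for one endpoint."""
    rule = REQUIREMENTS[(method, path)]
    roles = set(roles)
    if not rule.all_of <= roles:
        return False
    return (
        bool(rule.any_of & roles)
        or (rule.owner_ok and is_owner)
        or (rule.access_ok and has_dataset_access)
    )


print(is_allowed("DELETE", "/v2/deletedataset", {"ROLE_DATASET_ACTIONS"}))
print(is_allowed("POST", "/v2/postjobschedule", {"ROLE_OWL_CHECK"}, has_dataset_access=True))
```

A table like this can help when auditing which custom roles need updating after the upgrade. The product itself remains the source of truth for the checks.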
Connections
- The authentication dropdown menu for any given connection now displays only its supported authentication types.
- We've upgraded the following drivers to the versions listed:
- Db2 4.27.25
- Snowflake 3.19.0
- SQL Server 12.6.4 (Java 11 only)
Note
If you use additional encryption algorithms for JWT authentication, you must set one of the following parameters during your deployment of Collibra Data Quality & Observability, depending on your deployment type:
Helm-based deployments
Set the following parameter in the Helm Chart: --set global.web.extraJvmOptions="-Dnet.snowflake.jdbc.enableBouncyCastle=true"
Standalone deployments
Set the following environment variable in the owl-env.sh file: export EXTRA_JVM_OPTIONS="-Dnet.snowflake.jdbc.enableBouncyCastle=true"
Note
While the Java 8 version is not officially supported, you can replace the SQL Server driver in the /opt/owl/drivers/mssql folder with the Java 8 version of the supported driver. You can find Java 8 versions of the supported SQL Server driver on the Maven Repository.
Admin Console
- A new access control layer, Require DATASET_ACTIONS role for dataset management actions, is available from the Security Settings page. When enabled, a new out of the box role, ROLE_DATASET_ACTIONS, is required to allow its users to edit, rename, publish, assign data categories and business units, and enable integrations from the Dataset Manager.
- A new out of the box role, ROLE_ADMIN_VIEWER, allows users who are assigned to it to access the following Admin Console pages, but restricts access to all others:
- Actions
- All Audit Trail subpages
- Dashboard
- Inventory
- Schedule Restrictions
- Usage
Note Users with ROLE_ADMIN_VIEWER cannot access the pages to which the quick access buttons on the Dashboards page are linked.
Note Users with ROLE_ADMIN_VIEWER cannot add or delete schedule restrictions.
- You can now set both size- and time-based data purges from the Data Retention Policy page of the Admin Console. Previously, you could only set size-based data retention policies.
APIs
- We’ve made several changes to the API documentation. First, we aligned the role checks between the Product APIs (V3 endpoints) and the Collibra Data Quality & Observability UI. We’ve also enhanced the documentation in Swagger to include more detailed descriptions of endpoints. Lastly, we reproduced the Swagger documentation of the Product API in the Collibra Developer Portal to ensure a more unified user experience with the broader Collibra platform and allow for easier scalability of API documentation in the future.
Fixes
Integration
- The Overview - History section of Data Quality Job assets now displays the correct month of historical data when an integration job run occurs near the end of a given month.
Jobs
- DQ Jobs on SAP HANA connections with SQL referencing table names containing semicolons ; now run successfully when you escape the table name with quotation marks " ". For example, the SQL query select * from TEST."SPECIAL_CHAR_TABLE::$/-;@#%^&*?!{}~\+=" now runs successfully.
- You can now run DQ Jobs that use Amazon S3 as the secondary dataset with Instance Profile authentication.
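The quoting of SAP HANA table names described above can be applied programmatically when building a query. The following is a minimal sketch, and quote_identifier is a hypothetical helper rather than part of the product. It follows the standard SQL convention of wrapping the identifier in double quotes and doubling any embedded double quote, on the assumption that your database follows that convention.

```python
def quote_identifier(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded double quote."""
    return '"' + name.replace('"', '""') + '"'


# Table name taken from the example above; the raw string keeps the backslash.
table = r"SPECIAL_CHAR_TABLE::$/-;@#%^&*?!{}~\+="
query = f"select * from TEST.{quote_identifier(table)}"
print(query)
```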
Rules
- Native rules on DQ Jobs created on connections authenticated by password manager now run successfully and return all related break records when their conditions are met.
Alerts
- The Assignee column is no longer included in the alert email for Rule Status and Condition alerts with rule details enabled.
APIs
- Using the POST /v2/controller-db-export call on a dataset with an alert condition, followed by the POST /v2/controller-db-import call, now returns a successful 200 response instead of an unexpected JSON parsing error.
Latest UI
- When running DQ Jobs on NFS connections, data files with the date format ${yyyy}${MM}${dd} within their file name are now supported.
- Native Rules now display the variable name of parameters such as @runId and @dataset with the actual value in the Condition column of the Rules tab on the Findings page.
- The Jobs Schedule page now shows the time zone offset (+ or - a number of hours) in the Last Updated column. Additionally, the TimeZone column is now directly to the right of the Scheduled Time column to improve its visibility.
- You can now sort columns on the Job Schedule page.
- The Agent Configuration, Role Management - Connections, Business Units, and Inventory pages of the Admin Console now have fixed column headers, and the Actions button and horizontal scrollbar are now visible at all times.
- After adding or deleting rules, the rule count on the metadata bar now reflects any updates.
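The ${yyyy}${MM}${dd} file name format mentioned above for NFS connections can be pictured as a simple token substitution. The following is a minimal sketch: expand_date_tokens and the file name are hypothetical, and the sketch does not describe how Collibra Data Quality & Observability resolves the tokens internally.

```python
from datetime import date

# Map each token from the release note to a strftime directive.
TOKENS = {"${yyyy}": "%Y", "${MM}": "%m", "${dd}": "%d"}


def expand_date_tokens(pattern: str, run_date: date) -> str:
    """Replace the date tokens in a file name pattern with the run date parts."""
    for token, directive in TOKENS.items():
        pattern = pattern.replace(token, run_date.strftime(directive))
    return pattern


# Hypothetical file name pattern, for illustration only.
print(expand_date_tokens("sales_${yyyy}${MM}${dd}.csv", date(2024, 9, 5)))
# prints sales_20240905.csv
```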
Beta features
- The Rule Workbench now contains an additional query input field called “Filter,” which allows you to narrow the scope of your rule query so that only the rows you specify are considered when calculating the rule score. A filter query not only helps to provide a better representation of your quality score but improves the relevance of your rule results, saving both time and operational costs by reducing the need to create multiple datasets for each filter.
Important This feature is currently available as a public beta option. For more information about beta features, see Betas at Collibra.
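The effect of a filter on a rule result can be pictured with a small sketch. The rows, the rule, and the break_rate helper below are hypothetical. The share of breaking rows is only a stand-in metric and is not the scoring formula used by Collibra Data Quality & Observability. The point is that the filter narrows which rows are considered before the rule is evaluated.

```python
# Hypothetical sample data for illustration only.
rows = [
    {"region": "EU", "amount": 120},
    {"region": "EU", "amount": -5},
    {"region": "US", "amount": 40},
    {"region": "US", "amount": 75},
]


def break_rate(rows, rule, row_filter=lambda row: True):
    """Share of in-scope rows that break the rule; 0.0 when no row is in scope."""
    scoped = [row for row in rows if row_filter(row)]
    if not scoped:
        return 0.0
    breaks = sum(1 for row in scoped if rule(row))
    return breaks / len(scoped)


def negative_amount(row):
    """Rule: a row breaks when its amount is negative."""
    return row["amount"] < 0


def eu_only(row):
    """Filter: consider only rows from the EU region."""
    return row["region"] == "EU"


print(break_rate(rows, negative_amount))           # all four rows are in scope
print(break_rate(rows, negative_amount, eu_only))  # only the two EU rows are in scope
```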
Known limitations
- In this release, table names with spaces are not supported because of dataset name validation during the creation of a dataset. This will be addressed in an upcoming release.
DQ Security
The following image shows a chart of Collibra DQ security vulnerabilities arranged by release version.
The following image shows a table of Collibra DQ security metrics arranged by release version.
Release 2024.09
Release Information
- Release date of Collibra Data Quality & Observability 2024.09: September 30, 2024
- Release notes publication date: September 5, 2024
Enhancements
Platform
- We added support for FIPS-compliant algorithms.
Integration
- Users of the Quality tab in Collibra Data Intelligence Platform who do not have a Collibra Data Quality & Observability account can now view the 7-day history of data quality scores, allowing you to monitor the health of your data over time.
- When running a DQ Job with back run and an active integration, the results of the back run are now sent to Collibra Data Intelligence Platform where they are stored in the DQ services history table.
Pushdown
- You can now scan for shapes in DQ Jobs created on Trino Pushdown connections.
Jobs
- You can now create DQ Jobs on SAP HANA tables that contain the special characters ::$/-;@#%^&*?!{}~+=
Note The special characters .() are not supported.
Findings
- When the dupelimit and dupelimitui limits on the Admin Limits page are both set to 30, the Findings page now limits the number of dupe findings marked on the Dupes tab to 30.
Alerts
- If your organization uses multiple web pods on a Cloud Native deployment of Collibra Data Quality & Observability, you now receive only one alert email when an alert condition is met.
APIs
- When dataset security is enabled on a tenant, the Dataset Def API now returns the expected results for a user whose roles meet the requirements of the API and who has their role assigned to the dataset.
Fixes
Integration
- When setting up an integration, the Connections step now has a search component and pagination to prevent table load failure when the tables per schema size exceeds browser memory.
- The dimension type of Adaptive Rules (NULL, EMPTY, MIN, and so on) now correctly maps to the type and sub-type from the DQ dimension table.
- The predicate of custom rules is again included in rule asset attributes.
Pushdown
- You can again download CSV and JSON files containing rule break records of DQ Jobs created on Pushdown connections.
- The rule name no longer appears in the data preview and downloaded rule breaks of rule findings for DQ Jobs created on Pushdown connections.
- DQ Jobs created on Pushdown connections with columns whose shape values contain $ or ' now run successfully. Previously, such Jobs failed with an unexpected exception message.
- When running DQ Jobs created on Pushdown connections that scan multiple columns for duplicate values, the data of the columns now appears under the correct column name.
- When a DQ Job created on a Pushdown connection contains 0 rows of data and one or more rules are enabled, the rules are now included in the Job run and displayed on the Findings page.
Note For consistency with rule break downloads in Pullup mode, we plan to separate rule breaks by rule in a future release. As of this release, Pullup mode still includes the rule name and runId in rule break downloads.
Jobs
- The Explorer connection tree now loads successfully when a schema contains tables that contain unsupported column types.
- Dataset Overview on the Explorer page can now process the select * part of the SQL statement if there is an unsupported column type.
- We fixed an issue on the Jobs page where Collibra Data Quality & Observability was unable to retrieve the Yarn Job ID.
- When you re-run, from the metadata bar, a DQ Job that previously ran with backrun (-br), the re-run no longer incorrectly initiates a backrun.
Note If the -br option is included in the beginning of your command line, your DQ Job will perform a backrun and -br will be removed from the command line when the DQ Job completes.
Findings
- After retraining a behavioral finding to pass a value for a blindspot, the score now correctly reflects the retrained scoring model.
Profile
- When adding a stat rule for distribution from the +Add Rule option on the Profile page, the computed boundaries of categorical variables in the distribution rule now display correctly.
Rules
- When rules on Pullup datasets time out, the rule output record now displays the out-of-memory (OOM) exception message on the Findings page.
- Run Result Preview on the Rule Workbench now works as expected for custom rules that use the simple rule template.
- When creating a rule for an existing DQ Job created on a Pushdown connection, Run Result Preview now runs without errors.
- You can now use rules that contain a $ (not stat rules) with profiling off for DQ Jobs created on Pushdown connections.
Admin Console
- Admin Limits now require values to be -1, 0, or positive numbers. An inline message appears below the Value field when a limit does not meet the allowed values.
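The allowed values for Admin Limits described above can be expressed as a one-line check. The following is a minimal sketch, and is_valid_admin_limit is a hypothetical helper. It assumes that limits are whole numbers, which the release note does not state explicitly.

```python
def is_valid_admin_limit(value) -> bool:
    """Accept -1, 0, or a positive whole number (assumption: limits are integers)."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value >= -1


for candidate in (-1, 0, 30, -2):
    print(candidate, is_valid_admin_limit(candidate))
```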
Latest UI
- We improved the performance of the Scorecards page when there are many datasets to load.
- We reduced the number of backend calls on the Profile and Findings pages to improve the load time performance.
- You can now create, edit, and delete alerts for datasets with 0 rows. When you run a job on a dataset with 0 rows, the alerts function as expected.
- Dataset Manager no longer crashes due to slow network calls when you click the Filter icon as the page loads.
- We resolved multiple scenarios where the metadata bar did not display any rows or columns.
- When hovering over a row on the Scheduler page, the row has a gray highlight, but the days of week cells remain green or white, depending on whether a DQ Job is scheduled to run on a given day.
- The date picker on the Job tab of the Findings page is now available for DQ Jobs created on Pushdown connections and you can successfully run DQ Jobs with the dates you select.
- The Findings and other pages now load correctly in Safari browsers.
- DQ Jobs created on Pushdown connections no longer generate duplicate user-defined job failure alerts.
- The donut charts in the database and schema reports from Explorer now consistently display the correct stats.
DQ Security
The following image shows a chart of Collibra DQ security vulnerabilities arranged by release version.
The following image shows a table of Collibra DQ security metrics arranged by release version.