Release 2025.04 (Java 8 and 11 only)
Release Information
- Release date of Collibra Data Quality & Observability 2025.04: April 28, 2025
- Release notes publication date: April 2, 2025
Announcement
As a security measure, we are announcing the end of life of the Java 8 and 11 versions of Collibra Data Quality & Observability, effective in the August 2025 (2025.08) release.
In this release (2025.04), only the Java 17 build profile of Collibra Data Quality & Observability contains all new and improved features and bug fixes listed in the 2025.04 release notes. The Java 8 and 11 build profiles for Standalone installations contain the 2025.02 release and critical bug fixes addressed in 2025.03 and 2025.04. They do not contain any feature enhancements from the 2025.03 or 2025.04 releases.
While this release contains Java 8, 11, and 17 builds of Collibra Data Quality & Observability for Standalone installations, it is the final release to contain Java 8 and 11 builds and Spark versions older than 3.5.3. Between 2025.05 and 2025.07, only critical and high-priority bug fixes will be made for Java 8 and 11 versions of Collibra Data Quality & Observability.
For a breakdown of Java and Spark availability in current and upcoming releases, click "See what is changing" below.
Enhancements
Platform
- When an externally managed user enables assignment alert notifications and enters an email address on their user profile, alert notifications are now sent for findings assigned to that email address.
- SAML_ENTITY_BASEURL is now a required property for SAML authentication.
- To improve the security of our application, we upgraded SAML. As part of the upgrade, the following SAML authentication properties are no longer used:
- SAML_TENANT_PROP_FROM_URL_ALIAS
- SAML_METADATA_USE_URL
- SAML_METADATA_TRUST_CHECK
- SAML_INCLUDE_DISCOVERY_EXTENSION
- SAML_SUB_DOMAIN_REPLACE_SOURCE
- SAML_SUB_DOMAIN_REPLACE_TARGET
- SAML_MAX_AUTH_AGE
Note If the SAML_LB_EXISTS property is set to true and SAML_LB_INCLUDE_PORT_IN_REQUEST is set to false, you may need to update SAML_ENTITY_BASEURL to include the port in the URL. The SAML_ENTITY_BASEURL value should match the IdP's ACS URL.
Additionally, we recommend using the Collibra DQ UI when signing in with SAML, as SP-initiated sign-on is not fully supported in Collibra DQ.
Tip We recommend synchronizing the clocks of the Identity Provider (IdP) and Service Provider (Collibra DQ) using Network Time Protocol (NTP) to prevent authentication failures caused by significant clock skew.
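For illustration only, the sketch below (not part of Collibra Data Quality & Observability) compares the host and port of a SAML_ENTITY_BASEURL value against an IdP ACS URL, which is the check the note above describes; the URLs shown are hypothetical placeholders.

```python
# Illustrative sketch: verify that SAML_ENTITY_BASEURL matches the host and port
# of the IdP's ACS URL. The URLs below are hypothetical placeholders.
from urllib.parse import urlparse

SAML_ENTITY_BASEURL = "https://dq.example.com:9000"    # hypothetical value
IDP_ACS_URL = "https://dq.example.com:9000/saml/SSO"   # hypothetical ACS URL from the IdP

DEFAULT_PORTS = {"https": 443, "http": 80}

def host_and_port(url):
    parsed = urlparse(url)
    return parsed.hostname, parsed.port or DEFAULT_PORTS.get(parsed.scheme)

# With SAML_LB_EXISTS=true and SAML_LB_INCLUDE_PORT_IN_REQUEST=false, a mismatch
# here usually means SAML_ENTITY_BASEURL needs the port added.
if host_and_port(SAML_ENTITY_BASEURL) != host_and_port(IDP_ACS_URL):
    print("Mismatch: update SAML_ENTITY_BASEURL to match the IdP's ACS URL.")
else:
    print("SAML_ENTITY_BASEURL matches the ACS URL host and port.")
```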
Connections
- Private endpoints are now supported for Azure Data Lake Storage (Gen2) (ABFSS) key- and service principal-based authentication and Azure Blob Storage (WASBS) key-based authentication using the cloud.endpoint=<endpoint> driver property. To do this, add cloud.endpoint=<endpoint> to the Driver Properties field on the Properties tab of an ABFSS or WASBS connection template. For example, cloud.endpoint=microsoftonline.us.
- You can now limit the schemas that are displayed in a JDBC connection in Explorer. This enhancement helps you manage usage and maintain security by restricting access to only the necessary schemas. (idea #CDQ-I-152)
- Hive connections now use the driver class name com.cloudera.hive.jdbc.HS2Driver. Additionally, only Hive driver versions 2.6.25 and newer are supported.
Note When you include a restricted schema in the query of a DQ Job, the query scope may be overwritten when the job runs. While only the schemas you selected when you set up the connection are shown in the Explorer menu, users are not restricted from running SQL queries on any schema from the data source.
Warning If you have a Standalone installation of Collibra Data Quality & Observability and leverage Hive connections, you must upgrade to the latest 2.6.25 Hive driver to use Collibra Data Quality & Observability 2025.02.
Jobs
- The Jobs tab on the Findings page now includes two button options:
- Run Job from CMD/JSON allows you to run a job with updates made in the job's command line or JSON. In Pushdown mode, this option is labeled Run Job from JSON because the command line option is not available.
- Run Job with Date allows you to select a specific run date.
Note The Run DQ Job button on the metadata bar retains its functionality, allowing you to rerun a job for the selected date.
- DQ Jobs on Trino Pushdown connections now allow you to select multiple link ID columns when setting up the dupes monitor.
Rules
- We are delighted to announce that rule filtering is now generally available. To use rule filtering, an admin needs to set RULE_FILTER to TRUE in the Application Configuration.
- We are also pleased to announce that rule tolerance is generally available.
- You can now define a DQ Dimension when you create a new template or edit an existing one to be applied to all new custom rules created using this template. Additionally, a Dimension column is now shown on the Templates page.
- There is now a dedicated break_msg column in the rule_output Metastore table, which shows the break message when a rule break occurs.
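A minimal sketch of reading the new column, assuming a PostgreSQL Metastore and hypothetical connection details; only the rule_output table and break_msg column come from this release, while the other column names are assumptions:

```python
# Illustrative sketch: read break messages from the rule_output Metastore table.
# The connection details and the dataset, rule_nm, and run_id column names are
# assumptions; break_msg is the new column described above.
import psycopg2  # assumes a PostgreSQL Metastore reachable from this host

conn = psycopg2.connect(
    host="metastore-host",  # hypothetical host
    port=5432,
    dbname="dq",            # hypothetical database name
    user="dq_user",
    password="...",
)
with conn, conn.cursor() as cur:
    cur.execute(
        """
        SELECT dataset, rule_nm, run_id, break_msg
        FROM rule_output
        WHERE break_msg IS NOT NULL
        ORDER BY run_id DESC
        LIMIT 20
        """
    )
    for dataset, rule_nm, run_id, break_msg in cur.fetchall():
        print(dataset, rule_nm, run_id, break_msg)
```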
Dataset Manager
- You can now add meta tags up to 100 characters long from the Edit option under the Actions menu on the Dataset Manager. (idea #CDQ-I-74)
Fixes
Platform
- You can now use the Bouncy Castle FIPS provider for Standalone deployments of Collibra Data Quality & Observability.
- To improve the security of our application, Collibra Data Quality & Observability now supports a Content Security Policy (CSP).
- We have improved the security of our application.
Jobs
- When a scheduled job runs, only the time it is scheduled to run displays in the Scheduled Time column on the Jobs page.
- When you add a -conf setting to the agent configuration of an existing job and rerun it, the command line no longer includes duplicate -conf parameters.
- When you edit and re-run a DQ Job with a source-to-target mapping of data from two different data sources, the -srcds parameter on the command line now correctly contains the src_ prefix before the source dataset, and the DQ Job re-runs successfully.
- When you include a custom column via the source query of a source-to-target mapping, the validate overlay function no longer fails with an error message.
- When you clone a dataset from Dataset Manager, Explorer, or the Job tab on Findings, the Job Name field and the Command Line in the Review step of Explorer now correctly reflect the "temp_cloned_dataset" naming convention for cloned datasets.
- DQ Jobs with the same run date can no longer run concurrently.
- Pullup DQ Jobs with names that include dashes, such as "sample-dataset," no longer remain stalled in the "staged" activity when you rerun them.
- The correlation and histogram activities no longer cause DQ Jobs on Databricks Pushdown connections to fail sporadically.
- The correlation activity now displays correctly in the UI and no longer causes DQ Jobs on Pushdown connections to fail.
Rules
- When you use Freeform SQL queries with LIKE % conditions, for example, SELECT * FROM @public.test where testtext LIKE '%a, c%', they now return the expected results.
- SQL Server job queries where the table name is escaped with brackets, for example, select * from dbo.[Table], now process correctly when the job runs.
- The Rule Results Preview button is no longer disabled when the API call to gather the Livy status fails due to an invalid or non-existent session. The API call now correctly manages the cache for Livy sessions terminated due to idle timeout.
- @t1 rules on file-based datasets with a row filter now return only the rows included in the filter.
- @t1 rules on Databricks datasets no longer return a NullPointerException error.
- The Export Rules with Details export now includes data from the new Tolerance rule setting.
- Data type rules now evaluate the contents of each column, not just its data type, to ensure the correct breaks are returned.
- When you run Rule Discovery on a dataset with the “Money” Data Class in the Data Category, the job no longer returns a syntax error when it runs.
- SQL reserved keywords included in string parsing are now correctly preserved in their original case.
- When you update the name of a rule and an alert is configured on it, the alert will now show the updated name when sent.
- When you update the adaptive level or pass value option in the Change Detection dialog box of an adaptive rule, you must now retrain it by clicking Retrain on the Behaviors tab of the Findings page.
Findings
- Labels under the data quality score meter are now highlighted correctly according to the selected time zone of the dataset.
Alerts
- When you set a breaking rule as passing, a rule status alert for rules with breaking statuses no longer sends erroneous error messages for that rule.
Connections
- When you substitute a PWD value as a sensitive variable on the Variables tab of a Databricks connection template, the sensitive variable in the connection URL is now set correctly for source-to-target mappings where Databricks is the source dataset.
- Editing an existing remote file job no longer results in an error.
- Teradata connections now function properly without requiring you to manually add the STRICT_NAMES driver property.
APIs
- The /v3/jobs/<jobId>/breaks/rules endpoint no longer returns a 500 error when using a valid jobId. Instead, it now returns empty files when no results are found for exports without findings.
- When you run a job using the /v3/jobs/run API that was previously exported and imported with /v3/datasetDefs, the Shape settings from the original job now persist in the new job.
- When you schedule a Job to run monthly or quarterly, the jobSchedule object in the /v3/datasetDefs endpoint now reflects your selection.
- When you call the /v3/rules/{dataset}/{ruleName}/{runId}/breaks endpoint while Archive Rules Break Records is enabled, break records are now retrieved from the source system instead of the Metastore.
- Meta tags are now correctly applied to new datasets that are created with the /v3/datasetDefs endpoint.
- When an integration import job fails, the /v3/dgc/integrations/jobs endpoint now returns the correct “failed” status. Additionally, the integration job status “ignored” is now available.
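As a minimal sketch of calling the /v3/jobs/run endpoint referenced in the items above, the snippet below posts a run request; the base URL, authentication header, and parameter names are assumptions rather than documented values:

```python
# Illustrative sketch: trigger a DQ Job through the /v3/jobs/run endpoint.
# The base URL, bearer-token header, and the "dataset" and "runDate" parameter
# names are assumptions; check the API documentation for your version.
import requests

BASE_URL = "https://dq.example.com"  # hypothetical Collibra DQ host
TOKEN = "..."                        # token obtained from your authentication flow

response = requests.post(
    f"{BASE_URL}/v3/jobs/run",
    params={"dataset": "example_dataset", "runDate": "2025-04-28"},  # assumed parameters
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=30,
)
response.raise_for_status()
print(response.json())
```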
Integration
- You can now use the automapping option to map schemas, tables, and columns when setting up an integration between Collibra Data Quality & Observability and Collibra Platform in single-tenant Collibra Data Quality & Observability environments.
- An invalid RunID no longer returns a successful response when using a Pushdown dataset with the /v3/jobs/run endpoint.
- Parentheses in column names are no longer replaced with double quotes when mapped to Collibra Platform assets. This change allows automatic relations to be created between Data Quality Rule and Column assets in Collibra Platform.