Release 2025.03
Release Information
- Release date of Collibra Data Quality & Observability 2025.03: March 31, 2025
- Release notes publication date: March 4, 2025
Announcement
Important
As a security measure, we are announcing the end of life of the Java 8 and 11 versions of Collibra Data Quality & Observability, effective in the August 2025 (2025.08) release.
In this release (2025.03), Collibra Data Quality & Observability is only available on Java 17 and Spark 3.5.3. Depending on your installation of Collibra Data Quality & Observability, you can expect the following in this release:
- Kubernetes installations
- Kubernetes containers automatically contain Java 17 and Spark 3.5.3.
- If you use custom drivers, ensure they are compatible with Java 17 and Spark 3.5.3.
- If you use file-based SAML authentication with the SAML_METADATA_USE_URL variable set to false in the owl-web ConfigMap, update the Meta-Data URL option on the SAML Security Settings page with your metadata file. Use the file:/opt/owl/config/idp-metadata.xml format, ensuring the file name begins with the prefix file:. For steps on how to configure this, go to the "Enable the SAML SSO sign in option" section in SAML Authentication (see the sketch after this list).
- Standalone installations
- To install Collibra Data Quality & Observability 2025.03, you must upgrade to Java 17 and Spark 3.5.3 if you have not already done so in the 2025.02 release.
- If you use custom drivers, ensure they are compatible with Java 17 and Spark 3.5.3.
- Follow the latest steps to upgrade to Collibra Data Quality & Observability 2025.03 with Java 17.
- If you use file-based SAML authentication with the SAML_METADATA_USE_URL variable set to false in the owl-env.sh script, update the Meta-Data URL option on the SAML Security Settings page with your metadata file. Use the file:/opt/owl/config/idp-metadata.xml format, ensuring the file name begins with the prefix file:. For steps on how to configure this, go to the "Enable the SAML SSO sign in option" section in SAML Authentication (see the sketch after this list).
- We encourage you to migrate to a Kubernetes installation to improve the scalability and ease of future maintenance.
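For reference, below is a minimal sketch of the file-based SAML settings described above. The variable name, the false value, and the file: format come from the steps above; the exact ConfigMap key/value syntax and the export form in owl-env.sh are assumptions, so check them against your own files.

```
# Kubernetes (owl-web ConfigMap):
SAML_METADATA_USE_URL: "false"

# Standalone (owl-env.sh):
export SAML_METADATA_USE_URL=false

# Value for the Meta-Data URL option on the SAML Security Settings page;
# the file name must begin with the file: prefix:
file:/opt/owl/config/idp-metadata.xml
```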
The April 2025 (2025.04) release will contain Java 8, 11, and 17 versions of Collibra Data Quality & Observability. This will be the final release to contain Java 8 and 11 builds and Spark versions older than 3.5.3, and it will include feature enhancements and bug fixes from the 2025.02 release and critical bug fixes from the 2025.03 and 2025.04 releases. Between 2025.05 and 2025.07, only critical and high-priority bug fixes will be made for the Java 8 and 11 versions of Collibra Data Quality & Observability. For a breakdown of Java and Spark availability in current and upcoming releases, and for more information, go to the Collibra Data Quality & Observability Java Upgrade FAQ.
Enhancements
Platform
- On April 9, 2025, Google will deprecate the Vertex text-bison AI model, which SQL Assistant for Data Quality uses for the "beta path" option. To continue using SQL Assistant for Data Quality, you must switch to the "platform path," which requires an integration with Collibra Platform. For more information about how to configure the platform path, go to About SQL Assistant for Data Quality.
- We removed support for Kafka streaming.
Connections
- Private endpoints are now supported for Azure Data Lake Storage (Gen2) (ABFSS) key- and service principal-based authentication and Azure Blob Storage (WASBS) key-based authentication using the cloud.endpoint=<endpoint> driver property. To do this, add cloud.endpoint=<endpoint> to the Driver Properties field on the Properties tab of an ABFSS or WASBS connection template, for example cloud.endpoint=microsoftonline.us (see the example after this list).
- Trino now supports parallel processing in Pullup mode. To enable this enhancement, the Trino driver has been upgraded to version 1.0.50.
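As an illustration, the Driver Properties field of an ABFSS or WASBS connection would contain a property of the following form; the endpoint value is the illustrative one from the note above, so substitute your own private endpoint:

```
cloud.endpoint=microsoftonline.us
```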
Jobs
- The Jobs tab on the Findings page now includes two button options:
- Run Job from CMD/JSON allows you to run a job with updates made in the job's command line or JSON. In Pushdown mode, this option appears as Run Job from JSON, because the command line option is not available.
- Run Job with Date allows you to select a specific run date.
Note The Run DQ Job button on the metadata bar retains its functionality, allowing you to rerun a job for the selected date.
- The order of link IDs in the rule_breaks and opt_owl Metastore tables for Pushdown jobs is now aligned.
- The options to archive the break records of associated monitors in the Explorer settings dialog box of a Pushdown job are now disabled when the Archive Break Records option is disabled at the connection level.
- We updated the logic of the maximum global job count to ensure it only increases, rather than fluctuating based on the maximum count of the last run job's tenant. This change allows tenants with lower maximum job counts to potentially run more total jobs while still enforcing the maximum connections for individual jobs. Over time, the global job count will align with the highest limit among all tenants.
- You can now archive the break records of shapes from SAP HANA and Trino Pushdown jobs.
- You can now use the new "behaviorShiftCheck" element in the JSON payload of jobs on Pullup connections. This allows you to enable or disable the shift metric results of Pullup jobs, helping you avoid misleading mixed data type results in string columns. By default, the "behaviorShiftCheck" element is enabled (set to true). To disable it, use the following configuration: "behaviorShiftCheck": false.
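As an illustration, here is a minimal fragment of a Pullup job's JSON payload with the shift check disabled. The element is shown in isolation; the rest of the job payload is omitted, and its exact position within your payload may differ:

```
{
  "behaviorShiftCheck": false
}
```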
Rules
- You can now set the MANDATORY_PRIMARY_RULE_COLUMN setting to TRUE from the Application Configuration Settings page of the Admin Console to require users to select a primary column when creating a rule. This requirement is enforced when a user creates a new rule or saves an existing rule for the first time after the setting is enabled. Existing rules are not affected automatically.
- The names of Template rules can no longer include spaces.
- CSV rule export files now include a Filter Query column when a rule filter is defined. If no filter is used, the column remains empty. The Condition column has been renamed to Rule Query to better distinguish between rule and filter queries. Additionally, the Passing Records column now shows the correct values.
- You can now apply custom dimensions added to the dq_dimension table in the metastore to rules from the Rule Details dialog box on the Rule Workbench. These custom dimensions are also included in the Column Dimension Report.
- Livy caching now uses a combination of username and connection type instead of just the username. This improvement allows you to seamlessly switch between connections to access features such as retrieving the run results previews for rules or creating new jobs for remote file connections, without manually terminating sessions.
Note Manually terminating a Livy session will still end all sessions associated with that user.
Findings
- You can now work with the Findings page in full-screen view.
Scorecards
- You now receive a helpful error message in the following scenarios:
- You create a scorecard without including any datasets.
- You update a scorecard and remove all its datasets.
- You add or update a scorecard page with a name that already exists.
Fixes
Platform
- We have improved the security of our application.
- We mitigated the risk of SQL injection vulnerabilities in our application.
- Helm Charts now include the external JWT properties required to configure an externally managed JWT.
Jobs
- Google BigQuery jobs no longer fail during concurrent runs.
- When you add a -conf setting to the agent configuration of an existing job and rerun it, the command line no longer includes duplicate -conf parameters.
- When you expand a Snowflake connection in Explorer, the schema is now passed as a parameter in the query. This ensures the Generate Report function loads correctly.
- Record change detection now works as expected with Databricks Pushdown datasets.
- When you select a Parquet file in the Job Creator workflow, the Formatted view tab now shows the file’s formatted data.
- When you edit a Pullup job from the Command Line, JSON, or Query tab, the changes initially appear only on the tab where you made the edits. After you rerun the job, the changes are reflected across all three tabs.
- The Dataset Overview now performs additional checks to validate queries that don't include a SELECT statement.
Rules
- When you update the adaptive level or pass value option in the Change Detection dialog box of an adaptive rule, you must now retrain it by clicking Retrain on the Behaviors tab of the Findings page.
- @t1 rules on file-based datasets with a row filter now return only the rows included in the filter.
- @t1 rules on Databricks datasets no longer return a NullPointerException error.
- When you run Rule Discovery on a dataset with the "Money" Data Class in the Data Category, the job no longer returns a syntax error.
Findings
- We updated the time zone library. As a result, some time zone options, such as "US/Eastern," have been updated to their new format. Scheduled jobs are fully compatible with the corresponding time zones in the new library. If you need to adjust a time zone, you must use the updated format. For example, "US/Eastern" is now "America/New_York."
- Labels under the data quality score meter are now highlighted correctly according to the selected time zone of the dataset.
Alerts
- You no longer receive erroneous job failure alerts for successful runs. Additional checks now help determine whether a job failed, improving the accuracy of job status notifications.
- You can now consistently select or deselect the Add Rule Details option in the Condition Alert dialog box.
Reports
- The link to the Dataset Findings documentation topic on the Dataset Findings report now works as expected.
Connections
- Editing an existing remote file job no longer results in an error.
- Teradata connections now function properly without requiring you to manually add the STRICT_NAMES driver property.
APIs
- When you run a job using the /v3/jobs/run API that was previously exported and imported with /v3/datasetDefs, the Shape settings from the original job now persist in the new job.
- Bearer tokens generated in one environment using the /v3/auth/signin endpoint (for local users) or the /v3/auth/oauth/signin endpoint (for OAuth users) are now restricted to that specific Collibra Data Quality & Observability environment and cannot be used across other environments (see the sketch after this list).
- We improved the security of our API endpoints.
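To illustrate the token scoping described above, here is a hypothetical sketch. The request fields, HTTP methods, and host names are assumptions for illustration only, not documented values; only the endpoint paths come from the notes above.

```
# Sign in as a local user on environment A to obtain a bearer token:
curl -s -X POST "https://dq-env-a.example.com/v3/auth/signin" \
  -H "Content-Type: application/json" \
  -d '{"username": "local-user", "password": "********"}'

# Using the returned token against the same environment succeeds:
curl -X POST -H "Authorization: Bearer <token-from-env-a>" \
  "https://dq-env-a.example.com/v3/jobs/run"

# Replaying the same token against environment B is now rejected:
curl -X POST -H "Authorization: Bearer <token-from-env-a>" \
  "https://dq-env-b.example.com/v3/jobs/run"
```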
Integration
- You can now use the automapping option to map schemas, tables, and columns when setting up an integration between Collibra Data Quality & Observability and Collibra Platform in single-tenant Collibra Data Quality & Observability environments.
- The Quality tab now correctly shows the data quality score when the head asset of the starting relation type in the aggregation path is a generic asset or when the starting relation type is based on the co-role instead of the role of the relation type.
- Parentheses in column names are no longer replaced with double quotes when mapped to Collibra Platform assets. This change allows automatic relations to be created between Data Quality Rule and Column assets in Collibra Platform.