Release 2024.11

Release Information

  • Release dates of Data Quality & Observability Classic:
    • November 25, 2024: Data Quality & Observability Classic 2024.11
    • December 11, 2024: Data Quality & Observability Classic 2024.11.1
  • Release notes publication date: October 31, 2024

Announcement

Important 

As a security measure, we are announcing the end of life of the Java 8 and 11 versions of Data Quality & Observability Classic, effective in the August 2025 (2025.08) release.

In the February 2025 (2025.02) release, Data Quality & Observability Classic will only be available on Java 17. Depending on your installation of Data Quality & Observability Classic, you can expect the following in the 2025.02 release:
  • Kubernetes installations: Kubernetes containers will automatically contain Java 17. You may need to update your custom drivers and SAML keystore to maintain compatibility with Java 17.
  • Standalone installations: You must upgrade to Java 17 to install Data Quality & Observability Classic 2025.02. Additional upgrade guidance will be provided upon the release date. We encourage you to migrate to a Kubernetes installation to improve scalability and ease future maintenance.

The March 2025 (2025.03) release will have Java 8 and 11 versions of Data Quality & Observability Classic and will be the final release to contain new features on those Java versions. Between 2025.04 and 2025.07, only critical and high-priority bug fixes will be included in the Java 8 and 11 versions of Data Quality & Observability Classic.

Additional details on driver compatibility, SAML upgrade procedures, and more will be available alongside the 2025.02 release.

For more information, visit the Data Quality & Observability Classic Java Upgrade FAQ.

Enhancements

Platform

  • When querying the rule_output table in the Metastore, the rows_breaking and total_count columns now populate the correct values for each assignment_id. When a rule filter is used, the total_count column reflects the filtered number of total rows.
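The corrected columns can then be used directly in per-assignment reporting queries. The following is a minimal sketch using an in-memory SQLite stand-in for the Metastore; the real rule_output table has additional columns, and the assignment IDs and counts here are hypothetical:

```python
import sqlite3

# Stand-in for the Metastore with a simplified rule_output schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE rule_output ("
    "assignment_id TEXT, rows_breaking INTEGER, total_count INTEGER)"
)
conn.executemany(
    "INSERT INTO rule_output VALUES (?, ?, ?)",
    [("rule_a", 12, 1000),   # unfiltered rule: total_count = all rows
     ("rule_b", 3, 250)],    # filtered rule: total_count = filtered rows
)

# Per-assignment breaking percentage, as a dashboard query might compute it.
for assignment_id, pct in conn.execute(
    "SELECT assignment_id, 100.0 * rows_breaking / total_count "
    "FROM rule_output"
):
    print(assignment_id, round(pct, 2))
```

Because total_count reflects the filtered row count when a rule filter is used, the breaking percentage stays meaningful for filtered and unfiltered rules alike.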

Integration

  • We automated connection mapping by introducing:
    • Automapping of schemas, tables, and columns.
    • The ability to view table statistics for troubleshooting unmapped or partially mapped connections.
  • Note Automapping is currently only available in multi-tenant environments. If your organization has a single-tenant environment, continue to use the manual mapping option until automapping is available in single-tenant environments in the 2025.03 Data Quality & Observability Classic release.

  • The Quality tab is now hidden when an aggregation path is not available. (idea #DCC-I-3252)

Pushdown

  • Snowflake Pushdown connections now support source to target analysis for datasets from the same Snowflake Pushdown connection.
  • You can now monitor advanced data quality layers for SAP HANA Pushdown connections, including categorical and numerical outliers and records.
  • Trino Pushdown connections now support multiple link IDs for dupes scans.

Connections

  • We now provide out-of-the-box support for Cassandra and Denodo data source connections. You can authenticate both connection types with a basic username and password combination or with the password manager method.
  • You can now authenticate SQL Server connections with NTLM.
  • We upgraded the Snowflake JDBC driver to 3.20.0.

Jobs

  • You can now set a new variable, -rdAdj, in the command line to dynamically calculate the run date and substitute it for the -rd variable at the run time of your DQ Job.
  • The metadata bar now displays the schema and table name.
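The -rdAdj substitution amounts to shifting the -rd run date by an offset before the job runs. The following is a minimal sketch of that calculation only, assuming a signed day offset; the actual -rdAdj syntax accepted on the command line may differ, so refer to the product documentation:

```python
from datetime import date, timedelta

def adjusted_run_date(rd: str, rd_adj_days: int) -> str:
    """Shift an ISO run date (-rd) by a signed number of days, as an
    illustration of the kind of substitution -rdAdj performs."""
    return (date.fromisoformat(rd) + timedelta(days=rd_adj_days)).isoformat()

# For example, running yesterday's partition from a job scheduled today:
print(adjusted_run_date("2024-11-25", -1))  # 2024-11-24
```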

Findings

  • If you assign multiple Link IDs in a Dupes configuration, each Link ID is now present in the break record preview.
  • When there are rule findings, the Breaking Records column on the Rules tab displays the number of rows that do not pass the conditions of a rule. In the Metastore, the values from the Breaking Records column are included in the rows_breaking column of the rule_output table. However, after initially upgrading to 2024.11, values in the rows_breaking column remain [NULL] until you re-run your DQ Job.
  • Important To include data from the rows_breaking column in a dashboard or report, you first need to re-run your DQ Job to populate the column with data.

Alerts

  • There are now 8 new variables that allow you to create condition alerts that trigger when the scores of findings meet their criteria. These condition variables include:
    • behaviorscore
    • outlierscore
    • patternscore
    • sourcescore
    • recordscore
    • schemascore
    • dupescore
    • shapescore
  • Example To create an alert for shape scores above 25, you can set the condition to shapescore > 25.

  • Job failure alerts are now sent when a DQ Job fails in the Staged or Initiation activities.
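Conceptually, each condition variable is a name bound to one of a finding's scores when the alert condition is evaluated. The following is a minimal sketch of that evaluation with hypothetical scores; the product's own condition parser is not shown, and eval() merely stands in for it:

```python
# Hypothetical finding scores; the variable names match the release notes.
scores = {
    "behaviorscore": 0, "outlierscore": 10, "patternscore": 0,
    "sourcescore": 0, "recordscore": 5, "schemascore": 0,
    "dupescore": 0, "shapescore": 30,
}

def alert_fires(condition: str, scores: dict) -> bool:
    """Evaluate a simple condition such as 'shapescore > 25' against
    a finding's scores. eval() stands in for the real parser."""
    return bool(eval(condition, {"__builtins__": {}}, scores))

print(alert_fires("shapescore > 25", scores))  # True
print(alert_fires("dupescore > 25", scores))   # False
```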

Dataset Manager

  • You can now edit and clone DQ Jobs from the Actions button in the far right column on the Dataset Manager.

Fixes

Integration

  • Data Quality Job assets now display a “No data quality score available” message when an invalid rule is selected.
  • When Data Quality & Observability Classic cannot retrieve the columns from a table or view during the column mapping process, the column UUIDs in Collibra Platform are now used by default.

Pushdown

  • You can now run Pushdown Jobs using OAuth Tokens generated by the /v3/auth/Oauth/signin endpoint.
  • Unique adaptive rules for Pushdown Jobs with columns that contain null values no longer fail when a scheduled run occurs.
  • When you turn behavioral scoring off in the JSON definition of a DQ Job created on a Pushdown connection, behavior scores are no longer displayed.
  • When DQ Jobs created on Pushdown connections run with Archive Break Records enabled, references to link IDs in the rule query are now checked and added automatically if they are missing. This also allows you to add your own CONCAT() when using complex rules.
  • We improved the performance of DQ Jobs created on Snowflake Pushdown connections that use LIMIT 1 for data type queries.

Connections

  • We fixed a critical issue that prevented DQ Jobs on temp files from running because of a missing temp file bucket error.

Jobs

  • Backrun DQ Jobs are now included in the Stage 3 Job Logs.
  • Data Preview now works correctly when the source in the Mapping (source to target) activity is a remote file storage connection, such as Amazon S3.
  • DQ Jobs on Oracle datasets now run without errors when Parallel JDBC is enabled.
  • When using Dataset Overview to query an Oracle dataset, you no longer receive a generic "Error occurred. Please try again." error message when the source data contains a column with a "TIMESTAMP" data type.
  • When including any combination of the conformity options (Min, Mean, or Max) from the Adaptive Rules tab, the column of reference on the Shapes tab is no longer incorrectly marked “N/A” instead of “Auto.”
  • Shapes can now be detected after enabling additional Adaptive Rules beyond the default Adaptive Rules settings for file-based DQ Jobs.
  • After setting up a source to target mapping in the Mapping step of Explorer where both source and target are temp files, you no longer encounter a “Leave this Mapping” message when you click one of the arrows on the right side of the page to proceed to the next step.

Findings

  • After suppressing a behavior score for a dataset that you then use to create a scorecard, the scorecard and Findings page now reflect the same score.
  • When you suppress a behavior score and the total score is over 100, the new score is now calculated correctly.

Rules

  • Rules with spaces in the link ID column now load successfully in the Rule Breaks preview. For example, the link ID column ~|ABC ~|, where ~| is the column delimiter, now loads successfully in the Rule Breaks preview.
  • When changing a rule type from a non-Native to Native rule, the Livy banner no longer displays and the Run Result Preview button is enabled. When changing any rule type to any other rule type that is non-Native, Livy checks run and the appropriate banner displays or the Run Result Preview button is enabled.
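The delimiter format mentioned in the link ID fix above can be illustrated as follows. This is a minimal sketch assuming ~| simply wraps each column value in the break record; the exact break-record encoding is not documented here:

```python
DELIM = "~|"

def parse_link_id_columns(record: str) -> list[str]:
    """Split a break record on the ~| column delimiter, keeping values
    that contain spaces (such as 'ABC ') intact."""
    # Strip the leading/trailing delimiter characters, then split on
    # the delimiters between columns.
    return record.strip(DELIM).split(DELIM)

print(parse_link_id_columns("~|ABC ~|"))        # ['ABC ']
print(parse_link_id_columns("~|ABC ~|XYZ~|"))   # ['ABC ', 'XYZ']
```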

Alerts

  • After you add 3 distinct alerts, one for each Rule Status trigger (Breaking, Exception, and Passing), plus one alert with all 3 triggers, unexpected alerts are no longer sent when the DQ Job runs and a single rule passes.
  • Batch alerts now use the same alerts queue to process as all other alert emails.

APIs

  • The /v2/getdatapreview API is now crossed out and marked as deprecated in Swagger. While this API is deprecated, it continues to function for backward compatibility with legacy workflows.
  • The Swagger UI response array now includes the 204 status code, which means that a request has been successfully completed, but no response payload body will be present.

Latest UI

  • The Adaptive Rules modal on the Findings page now allows you to filter the results to display only Adaptive or Manual Rules or display both.
  • We re-added the ability to expand the results portion of the Findings page to full screen.
  • There is now an enhanced warning message when you create an invalid Distribution Rule from the Profile page.
  • The Select Rows step of Explorer now has a tooltip next to the Standard View option to explain why it is not always available.
  • The Actions button on the Dataset Manager now includes options to edit and clone DQ Jobs.
  • The Rule Details dialog now has a tooltip next to the "Score" buttons to explain the downscoring options.
  • We consolidated the individual login buttons on the Tenant Manager page to a single button that returns you to the main login page.
  • Table headers in exported Job Logs generated from the Jobs page now display correctly.

Features in preview

Rules

  • You can now apply a rule tolerance value to indicate the threshold above which your rule breaks require the most urgent attention. Because alerts associated with rules can generate many alert notifications, this helps to declutter your inbox and allows you to focus on the rule breaks that matter most to you.
  • Rule filtering is now available for Pushdown DQ Jobs.

Maintenance Updates

  • We added a new check on the flyway library to resolve issues upon upgrade to Data Quality & Observability Classic 2024.10.
  • Denodo connections now support OAuth2 authentication.
  • You can now configure AWS passwordless authentication using Amazon RDS PostgreSQL as the Metastore.
  • Note AWS passwordless authentication is currently only supported for EC2 Instance Profile-based authentication with an Amazon RDS Metastore for Data Quality & Observability Classic standalone and cluster-based deployments. IAM pod role-based authentication support will be available in a future release.