Release Notes

2022.11

Warning 
The MS SQL driver that comes with JDK11 standalone packages does not currently work in the JDK11 environment. MSSQL requires a separate JAR for JDK11. Please contact your Customer Success Manager for the compatible driver.

Dremio is not currently supported for JDK11 standalone packages. If you plan to run JDK11, add -Dcdjd.io.netty.tryReflectionSetAccessible=true to owlmanage.sh as a JVM option for your Web and Spark instances. Please contact your Customer Success Manager for assistance.

Dremio jobs currently fail on both K8s and standalone JDK11 deployments. Add the following config to the Free Form (Appended) field of the Agent Configuration template: -conf spark.driver.extraJavaOptions=-Dcdjd.io.netty.tryReflectionSetAccessible=true.

As of October 18, 2022, all images for the 2022.10 release have a Critical CVE (CVE-2022-42889). If you picked up the 2022.10 release before October 18, 2022, there should be no issue with your scans. If issues persist, please contact your Customer Success Manager for a new build.

Note 
After you complete an upgrade or a new installation of Collibra DQ, you are now required to enter a license name in one of three ways: by following the one-time prompt on the login page, by setting the LICENSE_NAME environment variable in the environment variable file (owl-env.sh), or by setting the global.configMap.data.license_name Helm chart variable. Your license name is the value after YOUR NAME IS = in the license provision email sent to you by Collibra. If you do not have this information because your license was issued before March 2022, enter your license name in the format below.

For a single instance: <yourcompanyname>
For multiple instances: <yourcompanyname>-dev, <yourcompanyname>-test, <yourcompanyname>-prod
No spaces or special characters are permitted except for hyphens -.
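
The naming rules above can be checked before you enter a value. The following is a minimal Python sketch and not part of Collibra DQ: the function names are hypothetical, and it assumes that "no spaces or special characters except hyphens" means ASCII letters, digits, and hyphens only.

```python
import re

# Assumption: only ASCII letters, digits, and hyphens are allowed.
_LICENSE_NAME_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def is_valid_license_name(name: str) -> bool:
    """Return True if the name contains only letters, digits, and hyphens."""
    return _LICENSE_NAME_PATTERN.fullmatch(name) is not None


def instance_license_names(company: str, environments=("dev", "test", "prod")):
    """Build one license name per instance in the <company>-<env> format."""
    return [f"{company}-{env}" for env in environments]


if __name__ == "__main__":
    # "examplecorp" is a placeholder company name.
    for candidate in instance_license_names("examplecorp"):
        print(candidate, is_valid_license_name(candidate))
```

A name such as "example corp" is rejected by this check because it contains a space.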

New Features

Platform

  • The following pages now support the new React MUI:
    • Scorecards
    • List View
    • Assignments
    • Pulse View
    • Alerts

    Note React is turned off by default for the 2022.11 release. If you would like to try the new React pages, you can toggle it on from the Admin Console, or contact your Customer Success Manager for assistance.

DQ Job

  • You can now terminate jobs from the Jobs page if they are in progress, incorrectly submitted, or stuck in Staged status. When you terminate a job, two alerts are generated.
    • Jobs in the Spark UI display Finished statuses, even though they are terminated from the DQ UI.

Alerts

  • You can now generate alerts for the following stale data stat rules:
    • $daysWithoutData
    • $runsWithoutData
    • $daysSinceLastRun
  • You can now generate alerts for jobs stuck in Staged status for more than one hour.

Admin

  • You can now configure LDAP for user access in multi-tenant environments.

Connections

  • You can now use key-pair authentication for Snowflake connections.
    • When you append to the Connection URL string, your entry must be comma separated.
    • When you manually modify the Driver Properties field, your entry must be semicolon separated.
  • CDATA connections are now supported in standalone deployments.
    • CDATA drivers are now included in the release package.

Cloud Storage

  • Azure Blob Storage is now a supported target storage system.

Snowflake Pushdown (beta)

  • Schema Change monitoring from the AdaptiveRules tab is now enabled by default.
    • Schema is now separated from basic profiling.
  • The new DatasetDefDTO API now returns Pushdown information.
  • Dataset security checks are now implemented for Pushdown jobs.

Enhancements

Explorer

  • The Job Estimate dialog now has improved guidance on executors and cores. The Job Estimate now estimates when the core, executor, and memory maximums are reached.

DQ Job

  • Job schedule time zone is now a read-only field and can no longer be configured. Existing scheduled jobs reflect their current settings, but all other scheduled jobs are now based on the time zone of the DQ server (UTC). (ticket #88797, 89736, 92611, 95231)
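
Because new schedules follow the DQ server time zone (UTC), it can help to convert a desired local run time to UTC before scheduling. The following is a minimal Python sketch under stated assumptions: the function name is hypothetical, and it uses a fixed UTC offset that you supply, so it does not account for daylight saving time changes.

```python
from datetime import datetime, timedelta, timezone


def local_to_utc(local_time: datetime, utc_offset_hours: float) -> datetime:
    """Convert a naive local wall-clock time to UTC using a fixed offset."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return local_time.replace(tzinfo=local_tz).astimezone(timezone.utc)


if __name__ == "__main__":
    # Hypothetical example: a job should run at 09:00 in a UTC-5 location.
    desired = datetime(2022, 11, 15, 9, 0)
    print(local_to_utc(desired, -5).strftime("%Y-%m-%d %H:%M"))
```

In this example, 09:00 at UTC-5 corresponds to 14:00 UTC, which is the time to enter for the schedule.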

Dupes

  • A new warning message now displays when increasing the duplicate check limit from the UI. (ticket #95604)

Security

  • Kubernetes service accounts associated with AWS IAM pod roles for controlling access to AWS services for cloud native DQ deployments on AWS EKS are now supported.
  • When DATASET SECURITY is enabled, DATASET ACCESS is now required to edit, map, or retrieve datasets or business units. (ticket #92934)

Fixes

Rules

  • Fixed an issue that prevented freeform rules containing double backslashes from saving. (ticket #96636, 96640)
  • Fixed an issue that caused rules containing open brackets ([ ) to display break records incorrectly. (ticket #94399)
  • Fixed an issue that caused rules containing regex to throw out of range exceptions. (ticket #98435)

DQ Job

  • Fixed an issue where run time was not displayed on the Findings page because the run_id column type in the metastore did not include a time zone. (ticket #96050)
  • Fixed an issue that caused Parquet files to fail during the LOAD activity. (ticket #96191)
    • Other NFS file types, including ORC, CSV, and Avro, also run successfully.

Alerts

  • Fixed an issue when saving batch names that used spaces between delimiters, which caused an invalid error to occur. (ticket #97028)

Validate Source

  • The Add Column Names feature is now removed from the Source tab. (ticket #96066)
    • Instead, use the query to edit/limit columns or use Update Scope.
  • Fixed an issue where disabling source check on a cloned dataset resulted in an error. You can now disable source validation on cloned datasets. (ticket #97795)

Dupes

  • The Advanced Filter is now hidden from the Dupes tab. (ticket #96065)

Shapes

  • Fixed an issue when editing a dataset that reverted the Shape Detection setting (Off, Auto, or Manual) applied when it was created. (ticket #95471, 95473)

Schema

  • Fixed an issue with schema detection on files where schema detection was performed on all columns when a subset of columns was selected. (ticket #92476)
    • Use the headercheckoff flag when you only need to see when columns are added or dropped.
  • Fixed an issue where schema changes were not correctly identified and updated. (ticket #96013)

Behavior

  • Fixed an issue with behavior lookback (-bhlb) that caused Row Count changes to be misrepresented. (ticket #94840)

Connections

  • Azure Blob connections in standalone environments require the following jars to be added to the $SPARK_HOME/jars folder:
    • hadoop-azure-3.2.0.jar
    • wildfly-openssl-1.1.3.Final.jar

API

  • Fixed an issue with the DB import process to ensure JobSchedule records import without error. (ticket #98405)

Known Limitations

DQ Job

  • Job termination is not supported for jobs in Unknown status.

Validate Source

  • When you clone a dataset and then save, enable, or disable the Source tab, the update is associated with the original dataset name and fails on screen. The actual job run is not affected.

Connections

  • When adding driver properties using the +Add Property option for Snowflake connections, semicolons are incorrectly appended to key values. Instead, use comma format to separate key values.

DQ Security Metrics

Security vulnerabilities over 5 months

Critical vulnerabilities over 5 months

2022.10

New Features

Warning  For the Collibra Data Quality 2022.10 release, all Docker images run on JDK11. Standalone packages contain JDK8 and JDK11 options. If you are an existing customer who requires JDK11, please upgrade your runtime before upgrading to 2022.10. Most Hadoop environment versions (EMR/HDP/CDH) still run on JDK8, so customers using these environments can upgrade with the JDK8 packages. If you prefer to upgrade to JDK11, you must follow the documentation of your respective Hadoop environment to upgrade to JDK11 before deploying the 2022.10 release.

The MS SQL driver that comes with JDK11 standalone packages does not currently work in the JDK11 environment. MSSQL requires a separate JAR for JDK11. Please contact your Customer Success Manager for the compatible driver.

Dremio is not currently supported for JDK11 standalone packages. If you plan to run JDK11, add -Dcdjd.io.netty.tryReflectionSetAccessible=true to owlmanage.sh as a JVM option for your Web and Spark instances. Please contact your Customer Success Manager for assistance.

As of October 18, 2022, all images for the 2022.10 release have a Critical CVE (CVE-2022-42889). If you picked up the 2022.10 release before October 18, 2022, there should be no issue with your scans. If issues persist, please contact your Customer Success Manager for a new build.

Rules

  • You can now define a rule to detect the number of days a job runs without data by using $daysWithoutData.
  • You can now define a rule to detect the number of days a job runs with 0 rows by using $runsWithoutData.
  • You can now define a rule to detect the number of days since a job last ran by using $daysSinceLastRun.

Profile

  • You can now use a string length feature by toggling the Profile String Length checkbox when you create a data set.
    • When Profile String Length is checked, the min/max length of a string column is saved to the dataset_field table.

Validate Source

  • You can now write rules against a loaded source data frame when -postclearcache is configured in the agent.

Note The DQ UI will be converted to the React MUI framework with the 2022.11 release. Prior to the 2022.11 release, you can turn the React flag on, but note that some features may be temporarily limited.

Enhancements

DQ Job

  • Start Time and Update Time are now based on the server time zone of the DQ Web App.

Scheduler

  • The Job Schedule page now has pagination.

Scorecards

  • From Pulse View, you can now view missing runs, runs with 0 rows, and runs with failed scores.

Admin/Catalog

  • Connection details are now masked when non-admin users attempt to view or modify database connection details from the Catalog page. Only users with role_admin or role_connection_manager have the ability to view connection details on this page. (ticket #94430)

API

  • The /v2/getRunIdDetailsByDataset endpoint now provides the following:
    • The RunIDs for a given data set.
    • All completed DQ Jobs for a given data set.

Snowflake Pushdown (beta)

  • You can now detect shapes that do not conform to a data field. Pushdown jobs scan all columns for shapes by default.
  • You can now view Histogram and Data Preview details for the Profile activity.

Connections

  • The Snowflake JDBC driver is now updated to 3.13.14.

Fixes

Rules

  • Fixed an issue with the Rule Validator that resulted in missing table errors. The Validator now correctly detects columns. (ticket #93430)

DQ Job

  • Fixed an issue that caused queries with joins to fail on the load activity when Full Profile Pushdown was enabled. Pushdown profiling now supports SQL joins. (ticket #92409)
  • Fixed an issue that caused jobs to fail at the load activity when using the CTE query. Please note that CTE support is currently limited to Postgres connections. (ticket #88287, 89150)
  • Fixed an issue that caused inconsistencies between the time zones represented in the Start Time and Update Time columns.

Agent

  • Fixed the loadBalancerSourceRanges for web and spark_history services in EKS environments. (ticket #95398)
    • The Helm property global.ingress.* has been removed to separate the configuration for web and spark_history. Please update the property as follows: global.web.ingress.* and global.spark_history.ingress.*
  • Added support to specify the inbound CIDRs for the Ingress using the property global.web.service.loadBalancerSourceRanges. (ticket #95398)
    • Though Ingress is supported as part of Helm charts, we recommend attaching your own Ingress to the deployment if you need further customization.
    • This requires a new Helm chart.
  • Fixed an issue that caused Livy file estimates to fail for GCS on K8s deployments.
  • Fixed an issue that caused jobs to fail for GCS on K8s deployments.
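
The split of global.ingress.* into global.web.ingress.* and global.spark_history.ingress.* can be sketched as a key migration over flattened Helm values. This is an illustrative Python sketch only: the function name is hypothetical, the values are assumed to be held as a flat dictionary with dotted keys, and copying each old setting to both new prefixes is an assumption that you should review, because web and spark_history may need different values.

```python
OLD_PREFIX = "global.ingress."
NEW_PREFIXES = ("global.web.ingress.", "global.spark_history.ingress.")


def migrate_ingress_keys(values: dict) -> dict:
    """Copy each global.ingress.* key to the web and spark_history prefixes."""
    migrated = {}
    for key, value in values.items():
        if key.startswith(OLD_PREFIX):
            suffix = key[len(OLD_PREFIX):]
            for prefix in NEW_PREFIXES:
                # Keep a new-style key if it was already set explicitly.
                migrated.setdefault(prefix + suffix, value)
        else:
            # Explicit new-style keys and unrelated keys are kept as given.
            migrated[key] = value
    return migrated


if __name__ == "__main__":
    # "enabled" is a placeholder suffix used only for illustration.
    print(migrate_ingress_keys({"global.ingress.enabled": True}))
```

Keys that already use one of the new prefixes take precedence over values copied from the old prefix.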

Validate Source

  • The Add Column Names feature is scheduled for removal with the upcoming 2022.11 release. (ticket #96066)
    • This functionality predates the ability to limit the query directly (srcq) and the addition of Update Scope.
    • Use the query to edit/limit columns and also use Update Scope.
  • Fixed an issue that caused the incorrect message to display for [VALUE_THRESHOLD] when validate source was specified for a matched case. (ticket #94435)

Dupes

  • The Advanced Filter is scheduled for removal from the Dupes page with the upcoming 2022.11 release. (ticket #96065)

Explorer

  • Fixed an issue that caused BigQuery connections to incorrectly update the library (-lib) path when a subset of columns was selected. (ticket #96768)

Scheduler

  • Fixed an issue that prevented the scheduler from running certain scheduled jobs in multi-tenancy setups. Email server information is now captured from the correct tenant. (ticket #92898)

Known Limitations

Rules

  • When a data set has 0 rows returned, stat rules applied to the data set are not executed. While a full fix is planned for a future release, this limitation is only partially fixed as of 2022.10.

DQ Job

  • CTE query support is currently limited to Postgres connections. DB2 and MSSQL are currently unsupported.

Catalog

  • When using the new bulk actions feature, updates to your job are not immediately visible in the UI. Once you apply a rule, run a DQ Job against that data set. From the Rules tab, a row with the newly applied rule is visible.

Snowflake Pushdown (beta)

  • Freeform (SQLF) rules cannot use a data set name but instead must use @dataset because Snowflake does not explicitly understand data set names.
  • When using the SQL Query workflow, any subset of columns that you select in your SQL query must be enclosed in double quotes. Otherwise, the job runs indefinitely without failing.
  • Min/Max precision and scale are only calculated for double data types. All other data types are currently out of scope.

DQ Security Metrics

Security vulnerabilities over 5 months

Critical vulnerabilities over 5 months

2022.09

Enhancements

Rules

  • The Conditions column on the Rules tab now displays SQLG and SQLF rule definitions on hover.

DQ Job

  • The Jobs chart now shows a dotted gray line to represent jobs in Submitted status.
  • The Jobs chart now supports an hourly view option.
  • When you run a Pushdown Job that has a data set that returns 0 rows, an unclear message displays.

Schema

  • From the Config tab in Explorer, a Check Header checkbox under DQ Job is now available for when column names contain special characters. The Check Header checkbox is checked by default.
    • When checked, schema findings do not display when detected.
    • When unchecked, schema findings display when detected.

Behavior

  • Mean values are now rounded on the Findings page.

Explorer

  • SOH delimiters for files are now supported.
  • The Only checkbox on all Build Layer tabs is now removed.
  • The Profile activity is now always enabled and no longer has an on/off switch.

Alerts

  • Only one email per alert is now sent when alerts are set up for a scheduled job.
  • You can now check the logs to see when an alert does not send in order to resend the email.

Scheduler

  • The Findings page now displays a green indicator next to the Schedule icon when you schedule a job to run automatically. If Scheduler is inactive, a red indicator displays.

API

  • The v2/gethoot API now properly returns rule dimension information for data sets. (ticket #89973)

Connections

  • The Databricks connection template has changed, due to an upgrade of the driver. Any existing connection that uses the old driver must be updated. Refer to the new template. (ticket #19950)
  • The drivers for Athena, BigQuery, MongoDB, GCS, and Hive/Impala were also upgraded, but no connection change is required.

Spark

  • The 2022.11 release uses Spark 3.2.2.

Note We recommend using Spark 3.x for standalone installs/upgrades.

Fixes

Explorer

  • Fixed an issue that prevented the Job Estimator from properly displaying row estimates when the run date was modified during a new job run. (ticket #90860)
  • Fixed an issue that prevented DQ jobs created using NFS connection types from displaying under the Remote File Connections dropdown. (ticket #92479)
  • Fixed an issue that caused the file type parser to throw an error message when the default comma delimiter was not detected. The parser now detects a file's delimiter and updates the delimiter type in the UI automatically. (ticket #89489, 92480)

Files

  • The error message for Failed Merging Schema now has extra logging to clarify the cause of failed schema merges for both Livy sessions and non-Livy paths. (ticket #92694)

Security

  • Fixed an issue with the v2/getcatalogtableshasrulesfromcxn API that triggered a 403 status code when Dataset Security was enabled. (ticket #93298, 94258)

Agent

  • Fixed an issue that caused the Agent Check to no longer attempt check-ins to the metastore on K8s deployments, which resulted in red (unhealthy) status. (ticket #92055, 92963)
  • Fixed an issue that prevented concurrent users from properly running Livy sessions. (ticket #92963, 90432)

Known Limitations

Rules

  • The Rule Builder page becomes unusable if the user creates, validates, and saves a new rule and then re-edits it.

    • The workaround for this limitation is to do a full page refresh.

  • When a user attempts to validate a rule that contains a stat, an exception error is returned.

Security

  • The Assignments Queue feature is only available for local users. Support for externally connected users, such as SAML and AD connector, is not currently available.

Alerts

  • When alert recipient email addresses are separated by semicolons ;, alert emails are not sent to the intended recipients.
    • A workaround for this limitation is to separate alert recipient email addresses with commas , instead of semicolons.
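
The workaround above can be applied to an existing recipient string before you paste it into the alert configuration. The following is a minimal Python sketch with a hypothetical helper name. It is not part of Collibra DQ and only rewrites the separators.

```python
import re


def normalize_recipients(raw: str) -> str:
    """Return the recipient list joined with commas instead of semicolons."""
    # Split on commas or semicolons, trim whitespace, and drop empty entries.
    addresses = [part.strip() for part in re.split(r"[;,]", raw)]
    return ",".join(address for address in addresses if address)


if __name__ == "__main__":
    # The addresses below are placeholders.
    print(normalize_recipients("a@example.com; b@example.com;c@example.com"))
```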

Snowflake Pushdown

  • When you run a job with a data set that returns 0 rows, an unclear message displays.

  • When a native rule is created that contains an embedded stat, its calculated value will not display on the Job results page.

  • Data Set security is not supported.

  • Disabling autometrics does not take effect. All autometrics are still executed.

  • Creating a DQ job using only "SQL Query" workflow doesn't allow you to set the rundate value.

DQ Security Metrics

DQ security vulnerabilities over 5 months

Critical security vulnerabilities over 5 months

2022.08

New Features

Rules

Enhancements

Connections

  • You can now authenticate Oracle JDBC connections with Kerberos TGT, Keytab, and Password. (tickets #75267, 76030)
  • You can now authenticate SQL Server JDBC connections with Kerberos Keytab in addition to basic authentication.

Rules

  • Rule Summary enhancements:
    • You can now select different time periods for analysis.
    • You can now view charts from three different pages, including Rule Detail Summary, Rule Breaks, and Rule Dimension Summary.

Security

  • Vulnerabilities identified by Jfrog
    • Vulns 0, criticals 0, high severity 7
    • The majority of the current mediums are due to merging the dq-streaming module into core.
    • For a visual readout, see the DQ Security Metrics section below.

Agent

  • You can now optionally configure individual time zones of DQ Job, Web, and Agent. You should only use this configuration when your instance and containers run in different system time zones. (tickets #87024, 87155)

Behavior

  • The Behavior tab now has a new column, Delta Percent Change (Δ % Change).
  • You can now hover over new tooltips in the following columns:
    • Baseline
    • % Change
    • Δ % Change
    • Zscore
    • Score

Outliers

  • Outlier checks are now optimized to skip in certain circumstances. Outlier checks are only skipped when the history load of a specified date column is empty.
  • You can now update and modify record flags from the command line with -rc, -rcKeys, -rcDateCol, and -rcTbin.

API

  • The v2/gethoot API now properly returns rule dimension information for data sets.
  • The v3/jobs/run API now has improvements to the 400 Bad Request error messages in specific circumstances.

Reports

  • The PDF option is now removed from the Data Set Findings page. To print dynamic column tables, use CSV or Excel options instead. (ticket #89739)

DQ Connector

  • The version of Collibra Integration Library is now updated to 2.4.12.

Fixes

Connections

  • The new GCS jars are required to use GCS spark-history-server. (ticket #90623)

DQ Job

  • Fixed an issue that caused jobs using .TXT files to incorrectly render custom column names. (ticket #81808)
    • Files with .TXT extensions are now treated as delimited files. Files with .TXT extensions that are not delimited files should use their respective file type from the file type dropdown.
  • Fixed an issue with deployments on K8s where jobs failed when the volume name exceeded 63 characters. (ticket #85372)
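
For context, Kubernetes limits DNS label names, which include pod volume names, to 63 characters. The following minimal Python sketch shows how a name can be checked against that general Kubernetes rule. The function name is hypothetical, and the sketch does not describe how Collibra DQ resolved the issue.

```python
import re

MAX_LABEL_LENGTH = 63  # Kubernetes DNS label length limit (RFC 1123)

# Lowercase alphanumerics or hyphens, starting and ending with an alphanumeric.
_DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def is_valid_volume_name(name: str) -> bool:
    """Return True if the name is a valid Kubernetes DNS label."""
    return len(name) <= MAX_LABEL_LENGTH and _DNS_LABEL.fullmatch(name) is not None


if __name__ == "__main__":
    print(is_valid_volume_name("dq-job-volume"))
    print(is_valid_volume_name("a" * 64))
```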

Agent

  • Fixed an issue that caused the v2/updateagent API to fail when numCores was empty. (tickets #89737, 92404, 92680)
    • The numCores field is no longer a required field.

Validate Source

  • Fixed an issue that caused validate source jobs to fail when the pkey was mapped to different column names. (ticket #88778)

Rules

  • When using Freeform SQL rules with wild-card operators, rules again correctly pass validation. (ticket #89644)
  • Fixed an issue with regex rules that use the closing parenthesis ), comma , or semicolon ; characters in the rlike, which caused DQ to append spaces to those characters and prevented the regex from operating correctly. (tickets #89417, 92958)
  • Fixed an issue that caused rules with column values containing parentheses ( ) to break due to the addition of padding before and after closing parentheses. (ticket #85176)
  • Fixed an issue that caused rules with special characters such as @ to display incorrectly on the Rules page, Conditions tab, and when exported to Excel.
  • Fixed an issue that prevented data sets with attached rules and roles from being renamed. (tickets #85731, 92059, 94315)

Profile

  • Fixed an issue where certain results in TopN Values and Data Preview displayed in scientific notation. Scientific notation is now removed from the display. (tickets #82163, 89738)

Explorer

  • Fixed an issue that allowed CLOB data types to be visible in the Drag Columns to Target map in the Source tab. (ticket #86902)

API

  • The REST API endpoint v2/updateRoleDatasets again correctly saves roles to data sets.

Known Limitations

Rules

  • The Findings page displays results from computational stat rules on mean as a single-quote string. For example, '573523.87' > 6763
  • Column-level sorting for the Rule Summary feature is not currently available.

Admin

  • When adding a Sensitive Label or a Data Category, the Edit and Update functions do not display the selected record. To properly display the record, you must first refresh the page before editing or updating.

Session Activity

  • While the application UI is being redesigned, it is possible that when the application times out on the legacy side of the application, you might not be able to see it on the new React MUI side. This can happen when you have the DQ application open on multiple tabs.
    • We are not currently tracking session timeout from the legacy UI to React.

Beta features

DQ Job

  • Collibra is proud to launch a brand new feature, Snowflake Pushdown. Snowflake Pushdown allows for even faster processing and removes the need to set up a separate Spark compute platform to run Collibra Data Quality. Snowflake Pushdown is a private beta feature only available by request. Since this is a beta feature, some limitations are expected as we continue to improve its functionality. Contact your CSM to learn more about this feature.

DQ Security Metrics

Warning  A critical CVE (CVE-2016-1000027) shows up in the image scan due to the Spring version. This is a false positive and should be added to the exception list of your scan tools. We don’t use HttpInvokerServiceExporter anywhere in the application and are not impacted by it.

DQ security vulnerabilities over 5 months

Critical security vulnerabilities over 5 months

2022.07

Note Standalone packages for the 2022.07 release have a version naming convention of -RC. This will revert to the standard naming convention with the 2022.08 release, and has no impact on the safety or stability of standalone packages.

Fixes / Enhancements

  • DQ Job
    • Fixed an issue that prevented data from appearing in the Source tab when Source Observation RunID was clicked from the Assignments page.
    • Fixed an issue that caused Annotations with special characters to be truncated in the Labels tab.
    • Fixed an issue that caused the Column (name) column of the Rules tab to display incorrectly when Run Discovery was used.
    • Fixed an issue where the Retrain button on the Record tab was disabled.
    • You can again invalidate observations with single quotes ' from the Shapes tab.
    • The Hints tab now displays any available data.
    • You can no longer change agents from the Scheduler modal.
  • Rules
    • SQLF is now supported for Generic rules.
    • When running a custom rule through Rule Discovery, the column names Repo and Column again display correctly.
  • Alerts
    • You can now send emails using unauthenticated SMTP servers.
  • Security
    • Vulnerabilities identified by Jfrog
      • Vulns 0, criticals 0, high severity 7
      • For a visual readout, see the DQ Security Metrics section below.
    • Fixed an issue that allowed jobs to be run from the command line regardless of connection permissions.
      • When Connection Security is enabled, lock the SQL Editor to prevent unauthorized access to other connections. (#87916)
    • Fixed an issue that allowed View Only users to access some profile results and export the data to a CSV file.
      • Added an authorization check for data set access to the profile export feature, which allows only users with data set access to export the profile. (#87720)
    • Backslashes \ are no longer supported characters for AD usernames without disabling XSS for the /v2/updateadsecurityconfiguration API. (#88499)
    • Fixed an issue that prevented navigation back to the log in page when tenant access was denied. (#89024)
  • Profile
    • From the Labels tab, backslashes are now stripped from annotations when they are used for separation within strings.
  • Admin
    • From Audit Trail, when administrators modify roles mapped to data sets or data sets mapped to roles, changes are now documented automatically, and display original and updated values.
    • The Agent Group (H/A) and its associated endpoints are now deprecated.
    • From Usage, you can now access a table and tiles reflective of your monthly usage metrics.
    • Salesforce account ID can now be configured for use with Pendo logs.
    • *Tech Preview* [TP] ServiceNow integration
      • You can now assign incidents (validate action) to ServiceNow groups and users with the following fields included in the same request: caller_id, description, short_description, cmdb_ci.
  • Explorer
    • Fixed an issue with date range on Oracle connections, which caused end date to change to start date when Transform was selected.
    • The Job Estimate modal again displays the correct number of rows for Sybase connections.
    • Fixed an issue with Source to Target where double quotes " were removed from the source file in database to file targets.
  • Scorecards
    • Enhanced the layout of the Assignment Queues page.
  • API
    • v2/getallscheduledjobs is now available as an enhancement of the original, v2/getscheduledjobs.
      • A UI integration is planned for a future release.
  • Schedule
    • Added an Active column to the scheduler export.
      • The RunJob column was removed. (#88799)
  • Reporting
    • Fixed an issue that created misalignment of column headers in PDF exports. (#89739)

Known Limitations

  • Rules
    • To use the new SQLF feature for Generic rules, you must manually update the Generic rule type from SQLG to SQLF.
      • A UI feature for this is planned for a future release.
    • Stat rules such as $rowCount do not work for secondary data sets or previous runId of the same data set via @t1 syntax.
      • To work around this limitation, run a subquery to select count(*) from the secondary data set or the previous runId.
  • Explorer
    • Drill-ins and jobs on Sybase connections run successfully, but connections to Sybase with encrypted passwords are currently unsupported.
  • Files
    • When using CSV files, you cannot use a comma , in the name.
  • Admin
    • *Tech Preview* [TP] ServiceNow integration
      • Special characters !@#$%^&*() in the description are not supported and will not persist to the ServiceNow assignment queue at this time.
      • Empty or invalid ServiceNow group name does not return an error in CDQ.
        • As a result, the ServiceNow assignment is generated with the default admin account as the owner if left empty or invalid.
        • You must have a valid ServiceNow group name or its related sys_id.
      • The new REACT UI is not yet supported for the ServiceNow Group integration.

DQ Security Metrics

Warning  A critical CVE (CVE-2016-1000027) shows up in the image scan due to the Spring version. This is a false positive and should be added to the exception list of your scan tools. We don’t use HttpInvokerServiceExporter anywhere in the application and are not impacted by it.

Vulns over time

Criticals table

2022.06

Fixes / Enhancements

  • DQ Job
    • Fixed an issue with the Learning Phase in the Behavior feature. (ticket #82907)
      • Once CDQ has the minimum number of completed successful scans, the learning status now changes to PASSING or BREAKING based on the results.
  • Outliers
    • Fixed an issue where file lookback did not identify expected outliers. (#87967)
  • Alerts
    • When configuring email alerts, SMTP Username and SMTP password fields are still required fields. (#86033)
      • Validation relaxation is planned for the 2022.07 release.
  • Rules
    • Fixed an issue which caused rule breaks to report the opposite of what was defined when a Generic Rule utilizing regex/rlike was created. (#86977)
    • Fixed an issue where Data Classes with Date column types selected did not detect timestamps. (#83000)
    • Fixed an issue where Data Classes using the operators <, > or = caused the inverse rule created from this process to throw exceptions. (#83000)
    • When switching a data class from a regex to expression and then editing again, the regex checkbox is now correctly checked.
  • Agent
    • The Explorer page and Scheduler modal now display the same agents. (#86175)
  • Security
    • Vulnerabilities identified by Jfrog
      • Vulns 0, criticals 0, high severity 8
      • For a visual readout, see the DQ Security Metrics section below.
    • General advisory:
    • Major vulnerabilities related to Spring, ESAPI, and Swagger have been addressed.
    • No cross DB reference is allowed in explorer while accessing SQL database connections.
    • Sensitive UI fields such as username no longer allow autocomplete.
    • If configured, the ENV variable XSS_CANONICALIZE_INPUT_ENABLED should be removed from configmap or owl-env.sh.
    • When dataset security is turned on, you can now add role based authorization for editing existing datasets. (#87720)
    • You can now override the following mail settings from the App Config page within the Configuration section of the Admin Console:
      • "mail.transport.protocol" -- default = smtp
      • "mail.smtp.auth" -- default = true: If true, attempt to authenticate the user using the AUTH command
      • "mail.smtp.auth.login.disable" -- default = false: If true, prevents use of the AUTH LOGIN command
      • "mail.smtp.starttls.enable" -- default = true: If true, enables the use of the STARTTLS command (if supported by the server) to switch the connection to a TLS-protected connection before issuing any login commands.
      • "mail.smtp.ssl.enable" -- default = false: If set to true, use SSL to connect and use the SSL port by default. Defaults to false for the "smtp" protocol and true for the "smtps" protocol.
      • "mail.smtp.ehlo" -- default = true
      • "mail.debug" -- default = true
      • "mail.smtp.ssl.trust" -- default = : If set, and a socket factory hasn't been specified, enables use of a MailSSLSocketFactory. If set to "*", all hosts are trusted. If set to a whitespace separated list of hosts, those hosts are trusted. Otherwise, trust depends on the certificate the server presents. (#76775, 88089)
  • Profile
    • Mean value is now rounded appropriately within the Profile page.
      • For example: The value 2.4334334343345 is now rounded to 2.433.
  • Connections
    • From the Athena driver, you can now use MetadataRetrievalMethod=Query for database queries from the Connection URL. (#86139)
    • Fixed an issue where error messages on failed connections did not display informational text. (#85527)
    • Fixed an issue where NFS file connections under Remote File connections caused jobs to fail. (#88156)
      • Added File protocol for Spark load for NFS file system.
      • Added the nfs:// prefix while adding an NFS connection.
        • This will prepend the URI with the file:// protocol when an NFS file connection is loaded via Spark.
  • Catalog
    • The Graph option is no longer available in Quick links.
  • Admin
    • The Pendo integration is now active by default.
      • No sensitive information is collected; only high-level usage stats are collected.
      • All new customers starting with 2022.06 will receive a new license.
      • If you install a standalone environment, modify the <install-dir>/config/owl-env.sh file by adding your license name:
        export DQ_INTEGRATION_PENDO_ACCOUNTID=<your-license-name>
      • This new integration will not block or impair the functionality of the app in any way.
      • For more information on Collibra's subprocessors, please review Collibra's Subprocessors page.
    • The Agent Group (H/A) and its associated endpoints are now deprecated. (#83086)
    • Fixed an issue where the "Add Data Category" button was missing without required permissions. (#86625)
    • When a session expires on an Admin page, you are now redirected to the login page.
    • The Admin Limits page now displays informational text indicating that only limits of Tenant - Admin type are displayed on the page.
    • Fixed an issue when editing an existing data category which caused the 'Add new' modal to open instead of the 'Edit' modal. (#89617)
    • From Configuration Settings, DB Limits is now called Data Retention Policy.
  • Explorer
    • You can now view calculated views for SAP Hana when creating a DQ Job on the Explorer page. (#83147, 84328)
    • Fixed an issue which caused the Date range condition to incorrectly display results when using an Oracle connection. (#85802)
    • Fixed an issue where an error message was thrown when Transform was checked together with a Date Range condition on a Postgres connection. (#85802)
    • Fixed an issue where an equals sign = used in a -transform expression from Run CMD caused jobs to fail. (#71547)
    • Fixed an issue where schema and table names containing underscores _ were not accepted.
    • Fixed an issue that allowed jobs to run with a row limit of less than 1.
    • Fixed an issue where incorrect files loaded for preview from BLOB containers with Livy enabled.
    • CLOB data types are unsupported. (#86902)
    • Improved performance and logic when drilling into a database and schema from the Explorer page.
  • API
    • You can now access the API quick links page from the Admin Console React page.
    • When using Swagger, UI text now indicates when a field is case sensitive.
  • Reporting
    • Tech Preview [TP] Rule Summary page enhancements
      • You can now filter rule breaks by most frequent violations, most severe violations, and least violations.
      • You can now view interactive pie charts with rules and dimension summaries.
  • UI
    • The styling of the expandable legacy navigation pane and the React menu has been updated.
  • Legal
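The mail settings listed under Security above can be pictured as a set of defaults with overrides applied on top. The following Python sketch is illustrative only and is not Collibra DQ code: the keys and default values are copied from the list above, the helper names effective_mail_settings and ssl_trusts_host are hypothetical, and the empty default for mail.smtp.ssl.trust is assumed to mean "not set".

```python
# Defaults as listed in these release notes (values kept as strings,
# the way property files usually store them).
MAIL_DEFAULTS = {
    "mail.transport.protocol": "smtp",
    "mail.smtp.auth": "true",
    "mail.smtp.auth.login.disable": "false",
    "mail.smtp.starttls.enable": "true",
    "mail.smtp.ssl.enable": "false",
    "mail.smtp.ehlo": "true",
    "mail.debug": "true",
    "mail.smtp.ssl.trust": "",  # assumption: empty means "not set"
}


def effective_mail_settings(overrides):
    """Hypothetical helper: apply overrides on top of the listed defaults.

    Unknown keys are rejected so a typo does not silently do nothing.
    """
    unknown = set(overrides) - set(MAIL_DEFAULTS)
    if unknown:
        raise KeyError(f"unknown mail setting(s): {sorted(unknown)}")
    return {**MAIL_DEFAULTS, **overrides}


def ssl_trusts_host(trust_value, host):
    """Hypothetical helper: read mail.smtp.ssl.trust as the notes describe it.

    "*" trusts every host; a whitespace-separated list trusts the hosts it
    names. Any other case returns False, meaning the host is not explicitly
    trusted and trust depends on the certificate the server presents.
    """
    if trust_value.strip() == "*":
        return True
    return host in trust_value.split()


settings = effective_mail_settings({
    "mail.debug": "false",
    "mail.smtp.ssl.trust": "smtp.example.com relay.example.com",
})
print(settings["mail.debug"])
print(ssl_trusts_host(settings["mail.smtp.ssl.trust"], "relay.example.com"))
```

The host names in the usage lines are placeholders. In the product itself, these values are set from the App Config page in the Admin Console, as described above.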

Known Limitations

  • Validate Source
    • When comparing JDBC (target) to remote files such as S3 (source), there is a known Spark bug for "Recursive view detected".
      • This validate source combination is not possible in 2022.06 using Spark 3.2.
    • When using BigQuery as the source, the -libsrc value must be manually modified to include the core (Spark BigQuery connector) directory.
      • For example, /home/centos/owl/drivers/bigquery/core
  • Profile
    • Spark does not currently support varchar data types. All varchar data types are converted to String. Other unsupported data types may also be converted incorrectly.
  • Security
    • Permissions on the Export task have not yet been addressed when dataset security is turned on and you add role-based authorization for editing existing datasets. (#87720)

DQ Security Metrics

Warning
A critical CVE, CVE-2016-1000027, shows up in the image scan because of the Spring version. This is a false positive and should be added to the exception list of your scan tools. We do not use HttpInvokerServiceExporter anywhere in the application and are not impacted by it. Spring has not made a fix version available. For more details, see issue #24434 in spring-projects/spring-framework, "Sonatype vulnerability CVE-2016-1000027 in Spring-web project".

Vulns over time

Criticals table