Release 2023.06
Release information
- Release date of Collibra 2023.06.0: June 4, 2023
- Upgrade non-production environments: June 4, 2023
- Upgrade production environments: June 25, 2023
- Release date of Collibra 2023.06.1: July 9, 2023
- Release date of Collibra 2023.06.2: July 28, 2023 (GovCloud environments only)
- Release date of Edge 2023.05.1: May 21, 2023
- Release date of Edge 2023.05.2: June 4, 2023
- Release date of Edge 2023.05.3: July 2, 2023
- Release date of Edge 2023.05.4: July 9, 2023
- Release date of Edge 2023.05.5: July 16, 2023
- Release date of Edge 2023.05.6: July 30, 2023
- Relevant Jobserver version: 2023.05.0-55
Highlights
- The Azure Data Lake Storage (ADLS) file system integration via Edge is now generally available. This integration allows for the registration of ADLS as a data source in Collibra and the synchronization of the metadata. After the synchronization, the files and directories of the ADLS file system are represented in Collibra by specific asset types, retaining the original names.
The integration also supports the ingestion of schema information from Microsoft Purview.
New features
Data Catalog
- The Azure Data Lake Storage (ADLS) file system integration via Edge is now generally available. This integration allows for the registration of ADLS as a data source in Collibra and the synchronization of the metadata. After the synchronization, the files and directories of the ADLS file system are represented in Collibra by specific asset types, retaining the original names.
The integration also supports the ingestion of schema information from Microsoft Purview.
Data Lineage and BI integrations
Note Data Lineage is a cloud-only feature.
- When you create technical lineage for Azure Data Factory, you can filter the factories that the lineage harvester collects and processes by using the new factories property in the lineage harvester configuration file.
- When you create technical lineage for Snowflake by using the lineage harvester or on Edge, you can now:
- Use schema names as a filter to include lineage for objects only in the specified schemas.
- Limit the number of days of the user access history that Collibra Data Lineage collects and processes.
- Specify cross-referenced databases to ensure correct lineage across all databases.
Enhancements
Data Lineage and BI integrations
Note Data Lineage is a cloud-only feature.
- When ingesting DataStage data sources, you can specify the jobs and runtime parameters in the connection definitions for Collibra Data Lineage to collect and process.
- You can now enable the technical lineage via Edge features in the Lineage on Edge section. Previously, the features were in the Beta features section.
- When integrating SQL Server Reporting Services (SSRS) or Power BI Report Server (PBRS), the Collibra Data Lineage service instances now support CommandText with SQL that starts with “=”.
- When integrating Power BI, you can now use HTTP1 streams if you are experiencing timeout issues with the default HTTP2 streams. To do so, include the new optional property “useHttp1” in your lineage harvester configuration file, and set the value to “true”.
- When integrating Power BI via the Power BI harvester (integration method v1), which has been deprecated since 2022, the Power BI source code now includes an end-of-life message. Please migrate to Power BI via the lineage harvester (integration method v2) or via Edge, by August 1, 2023.
- When you integrate MicroStrategy via the new integration method (in preview), you can now view the source code for all tables and transformations, in the technical lineage Sources tab page. The source code shows information about the processes visible in the technical lineage and shows warnings and errors where a process has failed. This enhancement does not affect the success rate of metadata analysis.
- When ingesting CSV files as part of a Tableau integration, the “database > schema > table” structure in the technical lineage now matches the structure of the ingested CSV file in Data Catalog. This ensures that stitching can be achieved for CSV files.
- Collibra Data Lineage now supports the Power Query M function Table.CombineColumns.
- When ingesting Spark SQL data sources via the lineage harvester or via Edge, you can now use the externalDbName property to specify the database name.
- When ingesting Snowflake data sources, the Collibra Data Lineage service instances now support the DATA_RETENTION_TIME_IN_DAYS parameter for CREATE TABLE statements.
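For the useHttp1 option mentioned in the Power BI item above, the following is a minimal Python sketch of adding the property to a harvester configuration. Only the property name useHttp1 comes from this release note. The "sources" list, the "type" and "id" fields and their values, and the helper name enable_http1 are assumptions made for illustration, as is writing the value as a JSON boolean rather than a string. Check the lineage harvester documentation for the actual layout of your configuration file.

```python
import json


def enable_http1(config: dict, source_type: str = "PowerBI") -> dict:
    """Return a copy of config with useHttp1 set on matching source entries.

    Assumes (hypothetically) that the configuration is a JSON object with a
    "sources" list whose entries carry a "type" field.
    """
    updated = json.loads(json.dumps(config))  # deep copy via a JSON round trip
    for source in updated.get("sources", []):
        if source.get("type") == source_type:
            source["useHttp1"] = True  # property name taken from the release note
    return updated


# Hypothetical configuration; a real file contains more properties.
example = {
    "sources": [
        {"type": "PowerBI", "id": "powerbi-example"},
        {"type": "Snowflake", "id": "snowflake-example"},
    ]
}

result = enable_http1(example)
print(json.dumps(result, indent=2))
```

The helper leaves the input untouched and only changes entries of the requested type, so other sources in the same file keep their default HTTP2 behavior.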
Data Governance
- You can now manage the status of externally missing resources when using the REST Import API full synchronization endpoints.
- If you are logged in when your password expires, you are now required to reset your password to continue using the platform.
Browser Extension
Note Browser Extension is a cloud-only feature.
- The Open on Current Page button is now added to the configuration dialog box to enable you to show the Collibra Browser Extension overlay on the page that is currently open.
- When viewing an asset in the Collibra Browser Extension pane, the status of the asset is now shown.
Fixes
Data Catalog
- You can now collect sample data via Edge for child asset types of the Column asset type. (ticket #104500, 111605)
- We now wait longer for responses from the S3 integration API via Jobserver to prevent synchronization failures due to network request rate limits. (ticket #102125)
Data Lineage and BI integrations
Note Data Lineage is a cloud-only feature.
- We have upgraded:
- The Snowflake driver to address the CVE-2023-30535 vulnerability.
- The BigQuery driver to mitigate the CVE-2022-45688 vulnerability.
- When ingesting HiveQL, Collibra Data Lineage now supports Hive extension for the multiple inserts clause. (ticket #111873)
- When ingesting SQL Server Integration Services:
- Previously, Collibra Data Lineage filtered out some queries from being sent to the SQL parser due to legacy limitations. Now, Collibra Data Lineage does not filter out queries being sent to the SQL parser. You might find increased successful lineage as well as increased parsing and analysis errors, as Collibra Data Lineage tries to parse more queries into lineage. The new behavior will be seen when you synchronize the technical lineage for SQL Server Integration Services again. (ticket #105023)
- No SQL Source names are named None now. (ticket #110889)
- When integrating SQL Server Reporting Services (SSRS) or Power BI Report Server (PBRS):
- You no longer get an error if you filter on a folder to which you don’t have access. (ticket #114479)
- You no longer get an error if the “rd” namespace is not specified at the top level of a report (an RDL file). In that case, it is now taken from the child level. (ticket #112568)
- When integrating Tableau, backticks “`” in a query no longer result in missing columns when processing a CREATE TECHLIN VIEW.
- When integrating MicroStrategy, any forms that don't have expressions are now skipped. Previously, Collibra Data Lineage attempted to process such forms, resulting in errors. (ticket #114735)
- When ingesting Amazon Redshift data sources, the Collibra Data Lineage service instances now support the COLLATE function. (ticket #114243)
- When ingesting HiveQL metadata, the Collibra Data Lineage service instances now support the TBLPROPERTIES parameter with an empty list, for Hive CREATE TABLE statements. (ticket #113949)
- When ingesting Spark SQL data sources, the Collibra Data Lineage service instances now support identifiers that start with a number. (ticket #112898)
- When ingesting Snowflake data sources, the Collibra Data Lineage service instances now support aliases in combination with the FLATTEN function, when used in JOINs. (ticket #112812)
- Tableau ingestion via Jobserver/DIC now correctly sets the synchronization status to Successful. (ticket #111938)
Data Governance
- Importing multiple relations for the same asset via the REST Import API, in tabular format with over 1,000 rows, no longer causes some of the relations to be created and then deleted during the import operation. (ticket #111540)
- When importing complex relations, you can again see all available complex relation types. (ticket #111949)
- A table header in your input file that matches both an out-of-the-box attribute and a custom attribute no longer causes the import wizard to automatically map the custom attribute in addition to the out-of-the-box one. (ticket #110748)
- Local System Administrators can once again create or reset their own password using the link provided by Collibra via email. (ticket #114425)
- Logging into Collibra Console via SSO now works correctly, without throwing a 403 error.
Data Marketplace
- In Data Marketplace, we no longer show a filter value multiple times if the value matches the search criteria multiple times. (ticket #109252, 112998, 114575)
- The issue that caused descriptions with a lot of HTML tags to show less information in the search results has been fixed. We show the beginning of the description and highlight the search term if it appears. (ticket #112994)
Diagrams
- Backend timeouts for diagrams are now improved to prevent problems with long query processing. (ticket #102102)
Search
- Reindexing no longer fails when you manually rebuild the search index on a new instance.
Browser Extension
Note Browser Extension is a cloud-only feature.
- The Collibra Browser Extension icon no longer disappears when clicked on sites such as GitHub and Jira.
- The Collibra Browser Extension pane now dynamically responds as you switch between the Tableau dashboards.
Collibra maintenance updates
Collibra 2023.06.1
- Workflows that start when a user is added no longer prevent the creation of new SSO users when the start event is asynchronous. (ticket #115184, 117201, 117202, 117209, 117337, 117422, 117568, 117689, 117813)
Note If you have workflows that start when a user is added, ensure that the start event is asynchronous to enable the creation of new SSO users.
Collibra 2023.06.2
- Fixed a cross-site scripting vulnerability.
Edge and Data Lineage updates 
These updates contain security and bug fixes for Data Lineage, Edge sites and their capabilities. These releases may be planned outside the regular monthly or quarterly release. You'll see the fix versions if you are manually upgrading an Edge site or reviewing logs.
Edge 2023.05.1
- We fixed an issue that impacted character encoding for credentials used to authenticate for a component upgrade, which resulted in some existing Edge sites going offline after they were re-installed post 2023.05. With this fix, the character encoding for credentials works as expected, and existing Edge sites can be re-installed successfully. (ticket #113046, 113363, 113957, 114121, 114123, 114149, 114226, 114250, 114314, 114316, 114340, 114364, 114439, 114517, 114651, 114679, 115076)
- We fixed an issue within the Datadog helm chart that caused Edge sites installed before 2022.04 to become unhealthy when they were updated. (ticket #114112, 114226, 114316, 114346)
- We fixed an issue that prevented the search criteria from listing all relevant Kubernetes resources and resulted in edge-controller and edge-proxy restarting multiple times. With this fix, all Kubernetes resources that fit the search criteria, such as namespace and label, are returned, and edge-controller and edge-proxy no longer restart repeatedly.
- When you add the Databricks Unity Catalog synchronization capability, you can now include or exclude databases and schemas, and configure domain mappings via the "Filters and Domain Mapping (in preview)" field. This will replace the existing "Exclude Schemas" field in a future release.
- The Tableau and Power BI Edge Capabilities can now use up to 8GB of memory.
Edge 2023.05.2
- When integrating Data Quality & Observability Classic metadata via Edge, you can again ingest data quality rules into Collibra Platform. (ticket #110479, 113774, 114453, 114710, 114980, 115029, 115095, 115140, 115175)
- We have improved the security of Data Classification via Edge.
- When integrating Tableau via Edge, you can now filter on multiple sites, even when one of them is the default site.
- When you configure proxies to create technical lineage via Edge, you can now specify a comma-separated list of proxy servers.
- We applied a fix for a k8s bug that affects EKS Edge sites. This fix removes unused PVs from EKS nodes, which previously accumulated and caused some capabilities to fail. The full k8s bug fix will be included in EKS’ next platform release for k8s version 1.24.
Edge 2023.05.3
- We have improved the security of Data Classification via Edge.
Edge 2023.05.4
- We have improved one or more private preview features.
Edge 2023.05.5
- When you create technical lineage for Snowflake on Edge with the SQL-API ingestion method, you can use the displaySampleQueries property in the new Snowflake source ID configuration file to control whether a question mark (?) is displayed in place of certain static values, such as numbers or dates.
- When you create a technical lineage via Edge with the shared storage connection type, there is no longer a limit to the number of files you can have in the target directory. Previously, Edge loaded only the first 500 files and ignored the rest. (ticket #166907, 117339)
- Databricks Unity Catalog provides a "properties" field for Catalog, Schema, and Table objects that contains a map of arbitrary key-values. You can now ingest the values from the Table properties to specific attributes in the Table asset.
When you add the Databricks Unity Catalog synchronization capability, you can add a JSON string in the "Extensible Properties Mapping (in preview)" field to define the mapping between the "properties" field for Table objects in Databricks and the attribute IDs to ingest the data in. If you use this feature, make sure to set up all required characteristic assignments for the asset type.
This is a feature in preview.
- The ADLS integration via Edge now supports no_proxy servers.
- The Databricks Unity Catalog synchronization capability has been updated to resolve the duplicate key error. This error prevented you from combining the integration of Databricks Unity Catalog and the registration of the Databricks data source via the JDBC driver. (ticket #117971, 118135)
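The effect that the displaySampleQueries property controls, a question mark shown in place of static values, can be pictured with a small Python sketch. This is an illustration only and not Collibra's implementation: the function name mask_static_values, the regular expressions, and the assumed quoted date format are hypothetical, and which values are actually masked beyond "numbers or dates" is not stated in this note.

```python
import re

# Quoted ISO-style dates such as '2023-06-04' (assumed date format).
DATE_LITERAL = re.compile(r"'\d{4}-\d{2}-\d{2}'")
# Standalone integers or decimals, not digits inside an identifier like col1.
NUMBER_LITERAL = re.compile(r"(?<![\w.])\d+(?:\.\d+)?(?![\w.])")


def mask_static_values(sql: str) -> str:
    """Replace date and number literals in a SQL string with a question mark."""
    # Mask dates first so that their digits are not masked one group at a time.
    masked = DATE_LITERAL.sub("?", sql)
    return NUMBER_LITERAL.sub("?", masked)


query = "SELECT col1 FROM sales WHERE amount > 100 AND sold_on = '2023-06-04'"
print(mask_static_values(query))
# SELECT col1 FROM sales WHERE amount > ? AND sold_on = ?
```

Masking of this kind keeps the shape of a query readable while hiding the literal values it was run with.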
Edge 2023.05.6
- We have improved the security of Data Classification via Edge.
- The S3 synchronization capability has been updated to prevent any null pointer exception.