| 2023.01 |
|
| 2022.11 |
- When you integrate Power BI:
- Inactive workspaces and personal workspaces are no longer ingested.
- Filtering is improved. You can now use the optional properties excludeWorkspaceNames and excludeWorkspaceIds to exclude specified workspaces. Before configuring your filters, read about the advantages, limitations and configuration considerations in Power BI workspaces.
- The ownership information (admin and creator email addresses only) for reports is now ingested in Collibra. The "Owner in source" attribute is included on Power BI Report asset pages.
- The email addresses of all admins and creators of Power BI data models and workspaces are now ingested. Previously only a single email address was ingested, even if there were multiple admins or creators of the data object in Power BI.
- When you ingest Snowflake data sources, the databaseNames property is now correctly taken into consideration.
- When you integrate Tableau:
- Previously, when you filtered on a site, a Tableau Site asset was created in Collibra, but no metadata was ingested. Now, when you filter on a site, all metadata in the site is ingested in the specified domain. If, however, a site is specified in the lineage harvester configuration file, but not in the filters and domainMapping properties in the Tableau <source ID> configuration file, the metadata is ingested in the default domain.
- You can now use wildcards in the filters property in the Tableau <source ID> configuration file. Also, the filters property is no longer case-sensitive.
- You can now ingest sites that don't have workbooks.
- Ownership information (email addresses only) for projects, data models, workbooks and dashboards is now ingested in Collibra. The Owner in source attribute is included on Tableau Project, Tableau Data Model, Tableau Workbook and Tableau Dashboard asset pages.
- When you ingest Informatica PowerCenter data sources, the lineage harvester now correctly processes session mapplets. Previously, this failed with error message "'NoneType' object has no attribute 'lower'".
- When you ingest Informatica Intelligent Cloud Services data sources and the useCollibraSystemNames property is set to true, databases are now shown in the Technical lineage Browse tab pane with the specified system name, or as "UNDEFINED" if a database could not be mapped to a system name. If the property is set to false, all databases are now shown directly under the DATABASE node.
- When you ingest metadata from Oracle data sources, you can now add a new DatabaseOracle section in your lineage harvester configuration file, to specify the Oracle database name and ensure stitching without any workarounds.
- If you integrate SSRS-PBRS and use a <source ID> configuration file, the CustomDataSource section in the <source ID> configuration file is no longer mandatory.
- The lineage harvester now uses Looker 4.0 APIs, with paging options.
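
The wildcard and case-insensitive behavior of the Tableau filters property noted above can be pictured with a small Python sketch. This is an illustration only, not the harvester's implementation: it assumes "*" is the wildcard character (the notes do not specify the syntax), and the function name, project names and patterns are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def matches_filter(name: str, patterns: Iterable[str]) -> bool:
    """Return True if name matches any pattern, ignoring case.

    Assumption: "*" is the wildcard character. The release notes do not
    specify the harvester's actual wildcard syntax.
    """
    lowered = name.lower()
    # Lowercase both sides so that matching is case-insensitive.
    return any(fnmatchcase(lowered, pattern.lower()) for pattern in patterns)


# Hypothetical names and filter patterns, for illustration only.
patterns = ["Sales*", "finance"]
print(matches_filter("SALES EMEA", patterns))  # matched by "Sales*"
print(matches_filter("Finance", patterns))     # matched by "finance"
print(matches_filter("Marketing", patterns))   # not matched by any pattern
```

Lowercasing both the name and the pattern before matching is one simple way to get case-insensitive behavior. The real filter may normalize values differently.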
|
| 2022.10 |
- The lineage harvester now supports the following IBM DB2 constructs: PREVVAL FOR <sequence>, PREVIOUS VALUE FOR <sequence>, NEXTVAL FOR <sequence> and NEXT VALUE FOR <sequence>.
- You can now use the new optional "deleteRawMetadataAfterProcessing" property in your lineage harvester configuration file. With this property, you can delete your raw metadata from the Collibra Data Lineage service after processing. This property is applicable for all supported data sources.
- When you specify a Data Catalog URL in the lineage harvester configuration file, it no longer matters whether you include a trailing slash (/) in the URL.
- The Collibra Data Lineage service now supports the following transformations: Table.FromRecords and Table.IsEmpty.
- Collibra Data Lineage now supports key-pair authentication when ingesting Snowflake data sources.
- The PostgreSQL JDBC Driver is upgraded to version 42.4.1.
- The Collibra Data Lineage service can now compute indirect lineage from set queries, which are queries that use the UNION keyword together with an ORDER BY clause.
- When you integrate Power BI, the lineage harvester is now more resilient to OutOfMemory errors.
- When you integrate Tableau and filter on a sub-project, the metadata of the parent project is no longer ingested in Collibra. However, the parent Tableau Project asset is created in the default domain, to preserve the hierarchy required for stitching.
- Looker integration no longer fails if the "collibraSystemName" property is not included in the lineage harvester configuration file. If you want to specify the system name of a database in Looker, use the "collibraSystemName" property in the Looker source ID configuration file. If you don't specify a system name in the source ID configuration file, the system name in the technical lineage graph will be Default.
- In the case of a lookup procedure when ingesting Informatica Intelligent Cloud Services data sources, if the CONNECTIONSUBTYPE parameter is empty, the Collibra Data Lineage service now looks to the CONNECTIONREFERENCE parameter for the name. If that is also empty, then the name in the VARIABLE parameter is used. This ensures the correct detection of the SQL dialect.
- Fixed an issue related to dialect extraction when ingesting Informatica Intelligent Cloud Services data sources.
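
The fallback order described above for Informatica Intelligent Cloud Services lookups (CONNECTIONSUBTYPE, then CONNECTIONREFERENCE, then VARIABLE) can be sketched in Python as follows. The function name and the dictionary representation of the parameters are hypothetical, and treating whitespace-only values as empty is an assumption. The service's actual logic is not documented here.

```python
from typing import Mapping, Optional

# Parameter names are taken from the release note, in the documented order.
FALLBACK_ORDER = ("CONNECTIONSUBTYPE", "CONNECTIONREFERENCE", "VARIABLE")


def resolve_connection_name(params: Mapping[str, Optional[str]]) -> Optional[str]:
    """Return the first non-empty value in the documented fallback order."""
    for key in FALLBACK_ORDER:
        # Assumption: missing, None and whitespace-only values count as empty.
        value = (params.get(key) or "").strip()
        if value:
            return value
    # Nothing usable was found; how the service handles this case is not documented.
    return None


# Hypothetical parameter values, for illustration only.
example = {"CONNECTIONSUBTYPE": "", "CONNECTIONREFERENCE": "ora_conn", "VARIABLE": "x"}
print(resolve_connection_name(example))
```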
|
| 2022.09 |
- Previously, when you created a technical lineage for Power BI, SQL Server Reporting Services (SSRS) or Power BI Report Server (PBRS), the nodes in the technical lineage graph had a gray background, even if the data objects from your data source were stitched to assets in Data Catalog. Data objects now have the intended yellow background when creating a technical lineage for Power BI, SSRS or PBRS. We introduced this enhancement for Tableau and Looker in Collibra 2022.07.
- When you integrate Tableau, for every Tableau Workbook that you have permission to ingest, all Tableau Dashboards in the Workbooks are now correctly shown in the technical lineage graph. If you do not have permission on the Workbook or Dashboard level, the metadata of these data objects is not ingested.
- When integrating Power BI, the ownership information (email address only) for reports is now ingested in Collibra. The new Owner in source attribute is included on Power BI Report asset pages.
- The lineage harvester now uses Looker 4.0 APIs, with paging options.
- When you integrate Power BI, the lineage harvester is now more resilient against OutOfMemory errors.
- When you integrate Tableau and use domain mapping, subprojects are now ingested in the domains of their parent projects.
- The Collibra Data Lineage service instances now benefit from the following parsing enhancements when integrating Snowflake data sources:
- Support for the COLLATE keyword.
- Support for EXTERNAL TABLE syntax.
- When integrating Power BI, the descriptions of Data Set Tables and Data Set Columns in Power BI are now harvested.
- Fixed an issue that was resulting in a processing error when a column referenced in an ORDER BY clause references a repeated column in the SELECT column list.
- When integrating Tableau, you can now ingest sub-projects for which you have permission to ingest, even if you don’t have permission to ingest the parent projects.
|
| 2022.08 |
- Previously, when you created a technical lineage for a supported BI tool, the nodes in the technical lineage graph had a gray background, even if the data objects from your data source were stitched to assets in Data Catalog. Data objects now have the intended yellow background when creating a technical lineage for Power BI. This enhancement was introduced for Tableau and Looker in Collibra 2022.07. Soon, the enhancement will also apply to SSRS and PBRS.
- When synchronizing Tableau, the synchronization no longer fails if two data sources in the same project with the same name are returned from the Tableau API. The assets of both data sources are now synchronized in Collibra.
- You can now filter on the Tableau project level.
- When integrating Power BI, you can now ingest measures and show them in the technical lineage. Measures are included as the value in the Role in Report attribute on Power BI Column asset pages.
- When attempting to integrate Power BI with invalid Power BI credentials, the lineage harvester log file now provides a more helpful error message.
- When you specify the Power BI workspaces for ingestion, the filters are no longer case-sensitive.
- When integrating Looker, the ownership information (email address only) for folders, Looks and dashboards is now ingested in Collibra. The new Owner in source attribute is included on Looker Folder, Looker Look and Looker Dashboard asset pages.
- When integrating Power BI, the ownership information (email address only) for data sets and workspaces is now ingested in Collibra. The new Owner in source attribute is included on Power BI Data Model and Power BI Workspace asset pages.
- The lineage harvester log file now identifies whether you are using Tableau Online or Tableau Server, and the version of your Tableau environment.
|
| 2022.07 |
- The lineage harvester now retries getting a batch status if the first HTTP call fails due to a network error.
- Fixed an issue that was causing custom SQL queries to be identified as belonging to two different Tableau data sources. This resulted in a "Unique constraint failed" error.
- Fixed an issue that was resulting in the No asset matches the specified criteria error.
- When the lineage harvester fetches an access key for a data store, only active records are now fetched. Inactive records are ignored.
- The lineage harvester is more resilient against authorization expiration when ingesting Looker metadata.
- The lineage harvester log file now includes the following information:
- Your Tableau environment type: Tableau Online or Tableau Server
- The version of your Tableau environment
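
The retry behavior noted above for batch status calls can be sketched generically in Python. This is a minimal illustration under stated assumptions: the helper name, the retry count, the delay and the use of Python's built-in ConnectionError as the "network error" are all hypothetical and do not describe the harvester's internals.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def get_with_retry(fetch: Callable[[], T], retries: int = 1, delay_seconds: float = 0.0) -> T:
    """Call fetch(); on a network error, try again up to `retries` more times."""
    attempt = 0
    while True:
        try:
            return fetch()
        except ConnectionError:
            # Give up once the retry budget is used and re-raise the last error.
            if attempt >= retries:
                raise
            attempt += 1
            time.sleep(delay_seconds)


# Hypothetical status call that fails once and then succeeds.
calls = {"count": 0}


def flaky_batch_status() -> str:
    calls["count"] += 1
    if calls["count"] == 1:
        raise ConnectionError("simulated network error")
    return "COMPLETED"


print(get_with_retry(flaky_batch_status))
```

Only network-type errors are retried in this sketch. Other exceptions propagate immediately, which avoids repeating calls that fail for non-transient reasons.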
|
| 2022.06 |
- When synchronizing Power BI, the last sync time is now correctly shown in the Sources tab page.
- Fixed an issue that was causing the processing of harvested metadata batches to run without coming to completion.
- When ingesting Power BI, if there are Oracle data sources, the Oracle service name is now used, instead of the database name.
- When processing Tableau metadata, the Collibra Data Lineage servers no longer replace ">>" by "<}", which was resulting in parsing errors.
- Fixed an [SQLITE_ERROR] issue that was breaking the technical lineage when attempting to synchronize a data source.
- When processing Power BI metadata, SQL statements are now in upper case.
- When creating a technical lineage for Tableau, any unnecessary brackets "][" in the names of schemas are now removed.
- When integrating Power BI, you can now ingest measures without DAX. They are shown as attribute type Role in Report on Power BI Column asset pages.
|
| 2022.05 |
Warning The lineage harvester 2022.05 includes an internal format change to the password manager pwd.conf file. This means that if you use the lineage harvester 2022.05, you can no longer use the pwd.conf file with an older harvester.
- You can now integrate Power BI in Data Catalog via the lineage harvester, meaning you no longer need to use the Power BI harvester. Additional benefits include the following:
- Support for Power BI Data Flows.
- Descriptions of Power BI Reports.
- Statuses of Power BI Workspaces.
- Filtering and domain mapping.
Note The new Power BI integration method is specifically for new integrations. For those who have been ingesting Power BI via the Power BI harvester, we will soon release a migration script.
- Collibra Data Lineage now also supports the following BI integrations:
- MicroStrategy
- SQL Server Reporting Services and Power BI Report Server.
- You can now use token-based authentication when creating a technical lineage for Matillion.
Warning This enhancement is not backwards compatible. You must update your configuration file. If you use the lineage harvester 2022.05, you can no longer use the pwd.conf file with an older harvester.
- The useCollibraSystemName property is now solely used for the configuration of the system name.
- If you set the useCollibraSystemName property to true in your lineage harvester configuration file, but don't define the system name in the Tableau <source ID> configuration file, the Tableau technical lineage shows DEFAULT as the system name.
- If using a Tableau <source ID> configuration file:
- You can now use wildcards throughout the file.
- The hostName and connectorUrl properties are no longer case-sensitive.
- The PostgreSQL JDBC driver is now upgraded from 42.3.2 to 42.3.3.
- The Apache Hive JDBC driver is now upgraded from 2.6.17.1020 to 2.6.19.2022.
- The lineage harvester no longer hangs when harvesting metadata from certain data sources.
- The lineage harvester automatically refreshes Tableau tokens.
- You can now use the optional concurrencyLevel property in the lineage harvester configuration file, to specify the internal sizing, meaning the number of tasks that can be executed at the same time.
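
To illustrate what a setting such as concurrencyLevel means in general terms (an upper bound on the number of tasks executed at the same time), here is a small Python sketch. It is an analogy only: the harvester's internal task model is not documented in these notes, and the function name, task body and values are hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_tasks(task_count: int, concurrency_level: int) -> int:
    """Run task_count dummy tasks and return the peak number running at once."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def task() -> None:
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.01)  # stand-in for real work
        with lock:
            running -= 1

    # max_workers caps how many tasks can execute at the same time.
    with ThreadPoolExecutor(max_workers=concurrency_level) as pool:
        futures = [pool.submit(task) for _ in range(task_count)]
        for future in futures:
            future.result()  # re-raise any exception from the task
    return peak


# The observed peak never exceeds the configured level.
print(run_tasks(task_count=8, concurrency_level=2))
```

A higher level allows more parallel work at the cost of more memory and more simultaneous connections, which is the usual trade-off behind this kind of property.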
|
| 2022.04 |
- You can now use the databaseMapping property in your Tableau <source ID> configuration file, to map a Tableau technical database name to the real database name.
- When providing connection definitions for Informatica PowerCenter, the dbname property is no longer case-sensitive.
- When integrating Informatica PowerCenter data sources, Collibra Data Lineage now correctly creates a technical lineage when useCollibraSystemName is set to true.
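
Conceptually, the databaseMapping property described above acts as a lookup from a Tableau technical database name to the real database name. The Python sketch below shows that idea only. The function name, the example names and the exact-match, fall-through behavior are assumptions, and the actual shape of the property in the <source ID> configuration file is not shown in these notes.

```python
from typing import Mapping


def resolve_database_name(tableau_name: str, database_mapping: Mapping[str, str]) -> str:
    """Return the mapped real database name, or the Tableau name if unmapped."""
    # Assumption: names are matched exactly and unmapped names pass through unchanged.
    return database_mapping.get(tableau_name, tableau_name)


# Hypothetical mapping: Tableau technical database name -> real database name.
database_mapping = {"tableau_db_alias": "SALES_DWH"}

print(resolve_database_name("tableau_db_alias", database_mapping))
print(resolve_database_name("unmapped_db", database_mapping))
```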
|
| 2022.03 |
- By default, the lineage harvester no longer harvests images. If you want to include images, include the optional excludeImages property in your configuration file and set the value to false.
- When ingesting Tableau metadata, you can now leave empty the collibraSystemName property in your configuration file, even if the useCollibraSystemName property is set to true.
- The lineage harvester now correctly shows the help overview when you run the --help command.
- Hive source now skips harvesting DDL of exclusively locked tables.
- When you change the domain reference ID in the lineage harvester configuration file, Tableau assets are now successfully deleted from the previous domain and recreated in the new domain.
- You no longer see a Fiber Failed error while running the lineage harvester.
- Protobuf is upgraded to version 3.19.3.
- Fixed an issue that was causing incomplete technical lineage and stitching issues when using custom SQL in Tableau.
- Fixed an issue that resulted in a TableauHarvesterError when ingesting Tableau metadata via the lineage harvester.
- Fixed a NullPointerException when no column data type is harvested.
- Fixed an issue that was causing the ingestion of Looker metadata to fail.
- Fixed an issue that was causing a JsonParseError when ingesting Tableau metadata.
|
| 2022.02 |
General changes:
- The Hive JDBC driver is upgraded to AmazonHiveJDBC42-2.6.17.1020.
- Netty libraries are upgraded to version 4.1.72.
- System name added to creation of SQL sub-batch.
- When ingesting metadata from Microsoft SQL Server data sources, the dash character "-" in database names no longer causes ingestion to fail.
- Upgraded JDBC drivers for MySQL, PostgreSQL, Teradata, Snowflake, HiveQL, Spark SQL and Microsoft SQL Server data sources. Discontinued support for Active directory authentication for Azure data sources.
- Applied the "log4j2.formatMsgNoLookups=true" system property to the lineage harvester, as a mitigation step for CVE-2021-44228.
- The lineage harvester now correctly handles the * EXCEPT syntax for SQL scripts in BigQuery.
- The lineage harvester can now harvest parameter files in IICS data sources.
- The lineage harvester now renews the Looker API token if harvesting takes longer than one hour, to avoid an HTTP 401 Unauthorized error.
- When metadata from Snowflake data sources is analyzed, schema names are no longer wrapped in double-quotes.
- Fixed computational inefficiency in SQL scanner in case of multiple nested subSELECTs with wildcards.
- Support added for ALTER TABLE RENAME TO statement in Postgres.
- Support for Oracle package specifications split into multiple source files (e.g. package definition in one source file and package body in another source file).
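
The Looker token renewal mentioned in the list above follows a common pattern: cache the token and fetch a new one shortly before it expires. The Python sketch below illustrates that pattern generically. The class name, the safety margin and the injected fetch function are hypothetical, and the one-hour lifetime is inferred from the note rather than taken from the Looker API documentation.

```python
import time
from typing import Callable


class RenewingToken:
    """Cache an API token and renew it shortly before it expires."""

    def __init__(
        self,
        fetch_token: Callable[[], str],
        lifetime_seconds: float = 3600.0,  # assumption: tokens last one hour
        margin_seconds: float = 60.0,      # assumption: renew one minute early
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_token = fetch_token
        self._lifetime = lifetime_seconds
        self._margin = margin_seconds
        self._clock = clock
        self._token = ""
        self._expires_at = float("-inf")  # forces a fetch on first use

    def get(self) -> str:
        """Return a valid token, fetching a new one if the current one is near expiry."""
        if self._clock() >= self._expires_at - self._margin:
            self._token = self._fetch_token()
            self._expires_at = self._clock() + self._lifetime
        return self._token


# Hypothetical token source, for illustration only.
provider = RenewingToken(lambda: "example-token")
print(provider.get())
```

Calling get() before every request means a long-running harvest never sends an expired token, which is the failure mode behind the HTTP 401 error the note refers to.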
Parsing enhancements for various data source types:
- Microsoft SQL Server: all variants of hexadecimal literals.
- BigQuery: PARTITION BY clause in CREATE TABLE statements.
- Oracle:
- ENCRYPT algorithm in CREATE TABLE column definition.
- ON OVERFLOW clause of LISTAGG function.
- Redshift:
- Optional enclosing brackets [] for table references.
- DELETE queries that have WITH statements.
- SQL Server Integration Services: components of type "Microsoft.ScriptComponentHost", which is a subtype of the "Microsoft.ManagedComponentHost".
- HiveQL:
- Support for table references starting with numerical digits.
- Support for "pivot" as a table alias.
- Support for digit-starting column references.
- Support for digit-starting aliases.
- "default" allowed as schema name.
- Support for grouping sets.
- Better support for parsing "array" and "map" data types.
- Support for parsing "struct" data types.
- IBM DB2:
- Support for the STRIP function.
- Support for ¬=, ¬>, ¬<, !> and !< operators.
- Support for special registers, for example CURRENT SQLID and CURRENT SERVER.
- Support for CCSID clauses in CREATE TABLE statements.
- Support for APPEND clauses in CREATE TABLE statements.
- Support for VOLATILE clauses in CREATE TABLE statements.
- Support for DATA CAPTURE clauses in CREATE TABLE statements.
- Support for AUDIT clauses in CREATE TABLE statements.
- Fix CASE expression.
- Some keywords allowed as column names and column aliases.
Tableau-specific changes:
- The lineage harvester can now connect to Tableau Server or Tableau Online and ingest its metadata.
- Minor Tableau API improvements, including a fix for an issue that affected databases and tables.
- Upgraded JDBC drivers for MySQL, PostgreSQL, Teradata, Snowflake, Hive/Spark and Microsoft SQL.
- Dropped support for Active directory authentication for Azure sources.
- The new Tableau integration via lineage harvester supports the Tableau Explorer (can publish) role.
- Applied the "log4j2.formatMsgNoLookups=true" system property to the harvester as a mitigation step for CVE-2021-44228.
- You can now define custom pagination settings to help avoid node limit errors.
|
| 1.4.4 |
The lineage harvester now supports:
- Technical lineage for Matillion. Redshift and Snowflake projects in Matillion are supported.
- Snowflake syntax for the CONNECT BY clause.
|