Technical lineage general troubleshooting

Most common issues

The following messages or other issues can appear when you run the lineage harvester, view a technical lineage or upload the new relations to Data Catalog via Collibra Data Lineage.

Tip For a list of all error codes and messages that the lineage harvester displays, please see the lineage harvester error codes section.

Problem

Solution

No suitable driver is found.

Source '<datasource>' failed with exception: java.sql.SQLException: No suitable driver found for jdbc:snowflake://<hostname>

This error occurs when the JDBC driver configuration in the lineage harvester configuration file is incorrect, for example when a trailing slash is added to the hostname.

Examine the hostname in the lineage-harvester.conf file and ensure the hostname does not contain any slash (/).
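The check above can also be scripted. The following is a minimal Python sketch, assuming the configuration file parses as JSON and stores host names under a key called "hostname". The key name, the sample fragment and the find_bad_hostnames helper are illustrative assumptions, not part of the lineage harvester.

```python
import json


def find_bad_hostnames(config, key="hostname"):
    """Return every string stored under `key` that contains a slash.

    `config` is the parsed configuration (nested dicts and lists).
    The key name "hostname" is an assumption made for this sketch.
    """
    bad = []
    if isinstance(config, dict):
        for name, value in config.items():
            if name == key and isinstance(value, str) and "/" in value:
                bad.append(value)
            else:
                # Keep walking into nested objects and arrays.
                bad.extend(find_bad_hostnames(value, key))
    elif isinstance(config, list):
        for item in config:
            bad.extend(find_bad_hostnames(item, key))
    return bad


# Hypothetical configuration fragment, used only to demonstrate the check.
sample = json.loads(
    '{"sources": [{"id": "a", "hostname": "example.snowflakecomputing.com/"},'
    ' {"id": "b", "hostname": "example.snowflakecomputing.com"}]}'
)
print(find_bad_hostnames(sample))
```

Any value that the function returns should be corrected in the configuration file before you rerun the lineage harvester.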

You get the following error message:

Could not find or load main class lineage.lineage-harvester-<version nr.>

This error message appears when the folder path to the lineage harvester is invalid. Check the folder path and make sure that it does not contain any spaces.
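To see which part of a long folder path is the problem, a small helper can list the offending components. This is a sketch only; the function name and the sample path are hypothetical.

```python
from pathlib import PurePath


def parts_with_whitespace(folder_path):
    """Return the path components that contain at least one whitespace character."""
    return [
        part
        for part in PurePath(folder_path).parts
        if any(ch.isspace() for ch in part)
    ]


# Hypothetical installation path with a space in one folder name.
print(parts_with_whitespace("/opt/lineage harvester/bin"))
```

Renaming or moving the listed folders so that no component contains whitespace should resolve the error.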

You get the following error message:

Failed to load file '<file-name>'. If the file is not in UTF-8, please convert it accordingly.

This error message appears if the lineage harvester tries to read a non-UTF-8 SQL file of a data source with connection type SqlDirectory. To solve this issue, convert all SQL files to UTF-8 and rerun the lineage harvester.
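As an illustration of the conversion step, the following Python sketch finds SQL files that are not valid UTF-8 and rewrites them. It assumes the files use the .sql extension and are currently encoded as Windows-1252 ("cp1252"); both are assumptions, so replace them with the values that apply to your files and back up the files first.

```python
from pathlib import Path


def find_non_utf8_files(directory, pattern="*.sql"):
    """Return the files under `directory` that cannot be decoded as UTF-8."""
    bad = []
    for path in sorted(Path(directory).rglob(pattern)):
        try:
            path.read_bytes().decode("utf-8")
        except UnicodeDecodeError:
            bad.append(path)
    return bad


def convert_to_utf8(path, source_encoding="cp1252"):
    """Rewrite `path` as UTF-8, assuming it is currently in `source_encoding`.

    The default "cp1252" is only a guess; use the encoding your files really have.
    Working on bytes keeps the original line endings untouched.
    """
    text = Path(path).read_bytes().decode(source_encoding)
    Path(path).write_bytes(text.encode("utf-8"))
```

A typical use would be to call find_non_utf8_files on the folder that the SqlDirectory source points to, and then call convert_to_utf8 for each file it returns.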

The lineage harvester does not connect to hosts using a proxy server.

Technical lineage does not support proxy server authentication, but you can connect to a proxy server. For complete details, including the necessary commands, see Connecting to a proxy server.

You get the following error message or a similar certificate error:

Source '<data source name>' failed with exception: javax.net.ssl.SSLHandshakeException: General SSLEngine problem

This message appears when the proxy server sends an unexpected certificate to the lineage harvester or when the default Java TrustStore is empty or outdated.

First update Java and rerun the lineage harvester to see if that resolves the issue. If the same error message is shown, try the following:

You get the following error messages:

In the lineage harvester log file:

java.lang.Exception: No native library found for os.name=Linux, os.arch=x86_64, paths=[/org/sqlite/native/Linux/x86_64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib]

In the console:

Failed to load native library:<sqlite-file-name>. osinfo: Linux/x86_64 java.lang.UnsatisfiedLinkError: /tmp/<sqlite-file-name>: failed to map segment from shared object: Operation not permitted

The lineage harvester uses a temporary file containing an SQLite database as a cache file. That means that the user who runs the lineage harvester needs write permission to the /tmp folder.

If the lineage harvester cannot use the /tmp folder, you can specify another directory with write permissions using -Dorg.sqlite.tmpdir=<path to a temp directory>.

Example You have a temporary directory with write permissions. The path to this directory is custom/temp. Run the lineage harvester with the following command:
./bin/lineage-harvester -Dorg.sqlite.tmpdir=custom/temp full-sync
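Before pointing the lineage harvester at another directory, you can verify that the directory really is writable. The following Python sketch does this by creating and removing a temporary file; the function name is hypothetical. It checks write permission only and does not detect other restrictions, such as mount options on the file system.

```python
import os
import tempfile


def is_writable_directory(path):
    """Return True if a file can actually be created inside `path`."""
    if not os.path.isdir(path):
        return False
    try:
        # Creating and removing a real file is more reliable than
        # inspecting permission bits alone.
        with tempfile.NamedTemporaryFile(dir=path):
            pass
    except OSError:
        return False
    return True


# Check the directory that Python itself would use for temporary files.
print(is_writable_directory(tempfile.gettempdir()))
```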

The lineage harvester configuration file is not specified correctly.

harvester.error.ConfigurationProblem: Bad configuration: .sources[0](expected ',' or '}' got '"')

The lineage harvester configuration file includes JSON syntax errors.

Examine the lineage harvester configuration file and correct any errors. The example message above points to a missing comma (,) or closing curly bracket (}) in the first entry of the sources array.
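A generic JSON parser can help locate such a syntax error. The following Python sketch assumes that the configuration file is strict JSON and reports the line and column at which parsing fails. The helper name and the broken sample are hypothetical.

```python
import json


def find_json_error(text):
    """Return None if `text` is valid JSON, else a (line, column, message) tuple."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return (err.lineno, err.colno, err.msg)
    return None


# Hypothetical fragment with a missing comma between two properties on line 3.
broken = '{\n  "sources": [\n    {"id": "a" "type": "x"}\n  ]\n}'
print(find_json_error(broken))
```

To check a real file, read its contents as text and pass them to find_json_error. The reported position is where the parser gave up, so the actual mistake is at or just before that point.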

You get the following error message:

Technical lineage is not enabled for this Catalog instance.

First, make sure that there are no spelling errors in the Data Catalog section of the configuration file. If the configuration file is correct but the issue persists, create a support ticket in Salesforce to have technical lineage enabled for your Collibra Data Intelligence Cloud instance.

You get the following error message:

The size of the import file is too large (max size: 10 MB).

The file you are trying to upload exceeds the size limit for uploaded files.

Contact Collibra support to increase the maximum file size.

You get the following error message:

Source 'X' was never successfully processed..

This message appears when a source that is specified in the lineage harvester configuration file has never been successfully processed by the Collibra Data Lineage service.

You can either:

  • Remove source 'X' from the configuration file, and then run the command again.
  • Run a full-sync of source X, and then re-run the command that previously failed.

Technical lineage is unavailable because the selected table does not contain columns.

Technical lineage only includes tables that have columns. Add a relation of the type "Table contains/is part of Column" between your Table asset and Column assets.

You get the following message in your technical lineage:

The current asset doesn't have a technical lineage yet.

This message appears if one or more of the following situations apply:

  • The data source of the current asset is not included in the configuration file. If you want a technical lineage for this asset, add its data source to the configuration file.
  • You have upgraded to the lineage harvester 1.3.0 or newer or you created a technical lineage for the first time. In this case, you may need to restart your DGC service before you can see the technical lineage.
  • You see parsing errors. For more information, see the Sources tab page.
  • The full name of one or more relevant assets does not match (including for case-sensitivity) any of the names of the assets in the configuration file, which causes automatic stitching to fail. Make sure that the information in the configuration file and the Data Catalog physical data layer matches:
    • The relevant assets have relations between each other, for example <System asset> groups/is grouped by <Database asset> contains/is part of <Schema asset> contains/is part of <Table asset> contains/is part of <Column asset>.
    • The full name of your System asset matches the name of your system or the name you used in the configuration file.
    • The full name of your Database asset matches the name of your database (or, for Google BigQuery, your project) or the name you used in the configuration file.
    • The full name of your Schema asset matches the name of the Schema of the data source or the name you used in the configuration file.
    Tip Make sure that the full path of each asset in Data Catalog matches the full path of the corresponding data object from your data source on the Stitching tab page. Note that in Collibra, full paths are case-sensitive.
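Because full names and full paths are case-sensitive, names that differ only in letter case are easy to overlook. The following Python sketch compares two lists of names and reports such near matches. The function name and the sample names are hypothetical, and the sketch does not call any Collibra API; you would have to supply the two lists yourself.

```python
def find_case_mismatches(catalog_names, source_names):
    """Return (catalog_name, source_name) pairs that differ only by letter case.

    Catalog names that have an exact match in `source_names` are skipped.
    """
    exact = set(source_names)
    by_folded = {}
    for name in source_names:
        by_folded.setdefault(name.casefold(), []).append(name)

    mismatches = []
    for name in catalog_names:
        if name in exact:
            continue  # An exact, case-sensitive match exists.
        for candidate in by_folded.get(name.casefold(), []):
            mismatches.append((name, candidate))
    return mismatches


# Hypothetical names: the first pair differs only in case, the second matches.
print(find_case_mismatches(["SALES_DB", "public"], ["sales_db", "public"]))
```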

You get one of the following messages:

  • Nodes count exceeds the limit 350.
  • Edges count exceeds the limit 1000.

This message appears when the technical lineage graph exceeds the limit of 350 nodes or 1,000 edges, and is too large to build. This happens, for example, if you have a table with many columns and you try to show the technical lineage of all columns in a table in one graph.

If you see this message, we recommend that you browse through the technical lineage graph on the object level or select a single column in the Browse tab pane.

Note You cannot manually change these limits.
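The limits can be expressed as a simple check. The following Python sketch assumes that "exceeds" means strictly greater than the limit and that you already know the node and edge counts; how the viewer counts nodes and edges internally is not covered here, and the function name is hypothetical.

```python
MAX_NODES = 350   # node limit quoted in the message above
MAX_EDGES = 1000  # edge limit quoted in the message above


def exceeded_limits(node_count, edge_count):
    """Return the limit messages that a graph of this size would trigger."""
    messages = []
    if node_count > MAX_NODES:
        messages.append(f"Nodes count exceeds the limit {MAX_NODES}.")
    if edge_count > MAX_EDGES:
        messages.append(f"Edges count exceeds the limit {MAX_EDGES}.")
    return messages


# A hypothetical graph with 400 nodes and 900 edges exceeds the node limit only.
print(exceeded_limits(400, 900))
```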

You get the following error message in your technical lineage for a Microsoft SQL Server data source: "Oops, no data flow found in your SQL scripts. Make sure you upload DML queries like insert, update, merge that moves data between the tables."

This error message appears when you run the lineage harvester to create a technical lineage for a Microsoft SQL Server data source without having the correct permissions to the SQL Server. As a result, the lineage harvester processes empty files and there is no technical lineage available for this data source.

Make sure you have at least the VIEW DEFINITION permission or sysadmin role in Microsoft SQL Server.

Note If you use multiple users, make sure that each one of them has the proper permissions.

You get the following error message:

net.snowflake.client.jdbc.SnowflakeSQLLoggedException: JDBC driver internal error: Fail to retrieve row count for first arrow chunk: sun.misc.Unsafe or java.nio.DirectByteBuffer.<init>(long, int) not available.

The issue is related to the Arrow library, a dependency of the Snowflake JDBC driver. The issue has not been resolved in the Snowflake JDBC driver; to get rid of the error, set the JAVA_OPTS environment variable when you run the lineage harvester. For example, to process data from all data sources including the Snowflake data sources, take the following steps:

You get the following error message:

Source 'SnowflakeInfo' failed with exception: net.snowflake.client.jdbc.SnowflakeSQLException: SQL compilation error:

Database 'SNOWFLAKE' does not exist or not authorized.

To access the shared read-only Snowflake database, you need a user with a role that has the IMPORTED PRIVILEGES privilege.

If the privilege is not assigned to the default role in Snowflake, you can use the customConnectionProperties property in the lineage harvester configuration file to assign the appropriate role to the user. For example:
"customConnectionProperties": "role=METADATA"

The import job fails.

Note If the import job fails during import and the failing job is rolled back, you can have both old and new relations. The old relations were created during the first job and the new relations are created after the rollback. If more than one job is triggered, only the failed job is rolled back.

First, check the following:

  • The asset ID must exist.
  • The structure of the data must be correct.
  • The cardinality of the relation types between asset types must be respected.

Then, rerun the import of relations.

Relations are not changed as expected.

Check whether the lineage harvester refreshed the data source via a scheduled job. If the import job failed, then the data source was not refreshed and the previously created relations stay the same. If that happened, rerun the lineage harvester to import again.

Manual relations are overwritten.

We recommend that you do not manually add relations of the type "Data Element targets / sources Data Element" between asset types that are imported via the scheduled jobs. These relations are overwritten every time the scheduled job synchronizes the data source.

Ingesting Looker or Power BI assets fails.

For more information, see the following sections:

You get the following error message:

java.lang.OutOfMemoryError: Java heap space

This error message indicates that Java does not have enough memory allocated to finish the task. The error can occur at any time during a lineage harvester run. Follow these steps to increase the maximum heap size.

Note 4 GB RAM is sufficient in most cases, but more memory could be needed for larger harvesting tasks.

You get the following error message:

java.lang.NoSuchMethodError

This error message indicates that JAVA_HOME was not set, so the lineage harvester used an older version of Java. Use the following commands to point JAVA_HOME to Java 11, which is needed to run the lineage harvester successfully, and to verify the setting:

  • export JAVA_HOME=/us
  • echo $JAVA_HOME

You get the following error message:

Error: A JNI error has occurred, please check your installation and try again

Exception in thread "main" java.lang.UnsupportedClassVersionError: harvester/Harvester has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0 ...

In this error message, "class file versions up to 52.0" indicates that Java 8 was used; however, the lineage harvester requires Java version 11 or newer.
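The relationship between class file versions and Java releases is fixed: for Java 5 and later, the major version equals the release number plus 44, so 52 corresponds to Java 8 and 55 to Java 11. The following Python sketch applies that rule to the message text; the function names are hypothetical.

```python
import re


def java_version_for_class_file(major):
    """Map a class file major version to the Java release that produces it.

    For Java 5 and later, major version = release + 44 (52 is Java 8, 55 is Java 11).
    """
    if major < 49:
        raise ValueError("this sketch only covers Java 5 (major version 49) and later")
    return major - 44


def versions_from_error(message):
    """Return (required_java, installed_java) parsed from the error text, or None."""
    match = re.search(
        r"class file version (\d+)\.\d+.*?up to (\d+)\.\d+", message, re.DOTALL
    )
    if match is None:
        return None
    compiled, supported = (int(group) for group in match.groups())
    return (java_version_for_class_file(compiled),
            java_version_for_class_file(supported))
```

Applied to the message above, versions_from_error reports that the code was compiled for Java 11 while the runtime in use supports at most Java 8.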

If there are multiple versions of Java installed, the lineage harvester might pick Java 8 instead of Java 11. You can run the command java -version to check your Java version.

To resolve this issue, set the JAVA_HOME environment variable to the correct Java installation directory. To do so, run the lineage harvester with the following command:

On Windows:
set "JAVA_HOME=\path\to\java_11_dir" && .\bin\lineage-harvester.bat full-sync

Note The set command is specific to the Windows Command Shell. The command is different if you are using PowerShell.

On Linux:
JAVA_HOME=/path/to/java_11_dir ./bin/lineage-harvester full-sync

You get a NegativeArraySizeException error.

A NegativeArraySizeException error is shown if your Java Virtual Machine (JVM) has the string compaction feature disabled. In that case, a call to getBytes can attempt to allocate three times the size of the string's value, which can exceed the size limit.

To resolve this error, try running the lineage harvester with the string compaction feature enabled, by running the following command:

JAVA_OPTS='-XX:+CompactStrings -Xmx8g' ./lineage-harvester full-sync -s <source ID>

Synchronization of a data source fails completely or with some errors.

A synchronization job that is completed without any errors has the Success status. Other possible statuses, Completed With Error, Aborted and Failure, are determined in part by the value of the "Number of failed commands before stopping import job" setting, in Collibra Console.

For complete information, see Synchronization: Continue on error option.

You can view the results of a synchronization job in the Activities list.

In the Activities list, click Results in the relevant row to view the details of a synchronization job. The details are intended to help you resolve the errors. To help reduce the chance of an aborted synchronization job, consider increasing the value of the "Number of failed commands before stopping import job" setting.

The technical lineage viewer does not show the technical lineage.

If a technical lineage graph does not show up, change the technical lineage graph details on the toolbar to Objects.

For more information, go to Technical lineage viewer.

Testing connectivity

You can check whether the lineage harvester can connect to the Collibra Data Lineage service instance and Data Catalog.

  1. Open a command line in the lineage harvester folder.
  2. Run the lineage harvester with the test-connection command.
  3. Check the result, which shows whether the lineage harvester can connect to the Collibra Data Lineage service instance and Data Catalog.

The logs will also show the IP addresses of the Collibra Data Lineage service instances that you have to allow.

Password errors

If you mistyped the password or want to change an existing password, open the config/pwd.conf file in the lineage harvester folder and delete the following lines. As a result, the lineage harvester will ask for the password again.

{
	"url" : "<URL>",
	"userName" : "<user>",
	"password" : "<password>"
}

Tip If you have the lineage harvester version 1.3.0 or newer, you can also provide your passwords via stdin or a password manager.