Delete the technical lineage of a data source

You can delete the technical lineage of a data source if you no longer want to see it in the technical lineage graph. To delete the technical lineage of the data source, you must remove the configuration of the data source from the lineage harvester configuration file and use the ignore-source command to exclude the data source when you synchronize the technical lineage again.

Note: You always need at least one source in your lineage harvester configuration file.

Before you begin

Install the lineage harvester 2023.04 or newer.

Steps

  1. Optional: To determine the data source that you want to exclude from the technical lineage, run the list-sources command:

    • For Windows: .\bin\lineage-harvester.bat list-sources
    • For other operating systems: ./bin/lineage-harvester list-sources
    All data sources that were used to create the technical lineage are listed. The list also includes the source ID of each data source. You can use the list to identify the data source to be excluded.
  2. In the lineage harvester folder, open your lineage harvester configuration file.
  3. Delete the section with connection properties of the data source.
  4. Save the configuration file.
  5. Start the lineage harvester in the console and run the following command to ignore the data source:
    • For Windows: .\bin\lineage-harvester.bat ignore-source <source_ID>, where <source_ID> is the ID of the data source that you want to ignore.
    • For other operating systems: ./bin/lineage-harvester ignore-source <source_ID>, where <source_ID> is the ID of the data source that you want to ignore.
    The data source is excluded from the list of data sources that are used to create the technical lineage.
  6. Synchronize the technical lineage by running any of the following commands:
    • The sync command:
      • For Windows: .\bin\lineage-harvester.bat sync
      • For other operating systems: ./bin/lineage-harvester sync
    • The full-sync command:
      • For Windows: .\bin\lineage-harvester.bat full-sync
      • For other operating systems: ./bin/lineage-harvester full-sync

    For more information, go to Lineage harvesting app command options and arguments.

  7. When prompted, enter the passwords to connect to your Collibra Data Intelligence Cloud environment and to the data sources in the configuration file.
  8. The lineage harvester uploads the metadata of the remaining data sources in the configuration file to the Collibra Data Lineage service.
    The Collibra Data Lineage service synchronizes the technical lineage and removes the deleted data source from the technical lineage graph.
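
For non-Windows systems, steps 5 and 6 can be chained in a small shell function. The following is a minimal sketch, not part of the lineage harvester itself: the function name remove_source_lineage and the HARVESTER variable are hypothetical, and it assumes that you have already removed the data source from the configuration file (steps 2 to 4) and that you run it from the lineage harvester folder. It only calls the ignore-source, sync, and full-sync commands described above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wraps the ignore-source and sync steps described above.
# HARVESTER is an assumed variable name; it defaults to the launcher path
# used in this procedure for non-Windows systems.
HARVESTER="${HARVESTER:-./bin/lineage-harvester}"

# Usage: remove_source_lineage <source_ID> [sync|full-sync]
remove_source_lineage() {
  local source_id="${1:-}"
  local sync_command="${2:-sync}"

  # A source ID is required; list-sources shows the available IDs.
  if [ -z "$source_id" ]; then
    echo "usage: remove_source_lineage <source_ID> [sync|full-sync]" >&2
    return 2
  fi

  # Only the two synchronization commands from step 6 are accepted.
  case "$sync_command" in
    sync|full-sync) ;;
    *)
      echo "unknown sync command: $sync_command" >&2
      return 2
      ;;
  esac

  # Step 5: exclude the data source. Stop if the harvester reports a failure.
  "$HARVESTER" ignore-source "$source_id" || return

  # Step 6: synchronize the technical lineage for the remaining sources.
  "$HARVESTER" "$sync_command"
}
```

For example, after you load the function into your shell, remove_source_lineage <source_ID> full-sync ignores the data source and then runs a full synchronization. The lineage harvester can still prompt for passwords, as described in step 7.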

What's next

You can view the technical lineage. For more information, go to Technical lineage viewer.

You can check the progress of the technical lineage creation in Activities in your Collibra Data Intelligence Cloud environment. The Results field indicates how many relations were imported into Data Catalog. Go to the status page to see the log files of the SQL analysis.

If the lineage harvester log shows an error message or the harvesting process fails, you can use the technical lineage common errors and issues in the Collibra Support Portal to fix the error.