Download SQL files to the lineage harvester folder

You can download the SQL files of a data source that is stored locally and cannot be accessed via the network. The lineage harvester then stores the data source information in a ZIP file.

To create a technical lineage for these data sources, you only have to include the ID of the data source and the path to the ZIP file in the configuration file.

Note See the list of all supported data sources.

Prerequisites

Steps

  1. To create an empty lineage harvester configuration file, start the lineage harvester by entering one of the following commands:
    • Windows: .\bin\lineage-harvester.bat
    • For other operating systems: chmod +x bin/lineage-harvester and then bin/lineage-harvester
    An empty configuration file is created in the config folder.
  2. Save the configuration file in the config directory in the lineage harvester folder.
  3. Prepare the configuration file.
    Tip Use the configuration file generator to easily create a configuration file.
  4. When prompted, enter the passwords to connect to Collibra and your data sources. Do one of the following:
    • Enter the passwords in the console.
      The passwords are encrypted and stored in /config/pwd.conf.
    • Provide the passwords via command line.
      The passwords are stored locally and not in your lineage harvester folder.
  5. Start the lineage harvester again and do one of the following:
    • To download the SQL files of all data sources in the configuration file, run the following command:
      ./bin/lineage-harvester load-sources
    • To download the SQL files of specific data sources in the configuration file, run the following command:
      ./bin/lineage-harvester load-sources -s "ID of the data source"
      Tip This command downloads the SQL files of only the specified data sources and leaves the other SQL files unchanged, which reduces the download time. To download the SQL files of multiple data sources, add -s "ID of the data source" to the command once for each data source.
    The lineage harvester downloads the SQL files of the data sources and stores them in one ZIP file per data source in the lineage harvester output folder.
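
The repeated -s option from step 5 can be assembled from a list of data source IDs. The following is a minimal sketch, not part of the lineage harvester itself: the function name preview_load_sources and the IDs sales_dwh and finance_dwh are hypothetical placeholders, and the function only prints the command so that you can review it before running it. It assumes a Bash shell.

```shell
#!/usr/bin/env bash
# Sketch: print a load-sources command with one -s option per data source ID.
# preview_load_sources is a hypothetical helper, not a lineage harvester command.
preview_load_sources() {
  local args=(load-sources)
  local id
  for id in "$@"; do
    # Add one -s option for each data source ID, as described in step 5.
    args+=(-s "$id")
  done
  # Print the command for review only. IDs are printed unquoted, so quote them
  # yourself if they contain spaces. To run it: ./bin/lineage-harvester "${args[@]}"
  echo "./bin/lineage-harvester ${args[*]}"
}

# Hypothetical IDs; replace them with the IDs from your configuration file.
preview_load_sources "sales_dwh" "finance_dwh"
```

Calling the function without arguments prints the command that downloads the SQL files of all data sources in the configuration file.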

What's next?

You can now prepare a configuration file for the SQL files of the data sources that you want to include in your technical lineage.