Connecting to a proxy server

Warning The CLI lineage harvester is now deprecated and will officially reach its end-of-life on July 31, 2026. To ensure a smooth transition, we encourage you to begin creating technical lineage via Edge, if you haven't already.

Collibra Data Lineage supports proxy server connection and authentication. You can use the following parameters to connect to a proxy server.

On Windows

  1. Add the -D parameters to the JAVA_OPTS environment variable.
    Example 
    set JAVA_OPTS=-Dhttps.proxyHost="azusquid.imf.org" -Dhttps.proxyPort="8080" -Dhttps.proxyUser="myusername" -Dhttps.proxyPassword="mypassword"
  2. Run the lineage harvester in the same command line window: .\bin\lineage-harvester.bat

On other operating systems

  1. To access the hosts via a proxy server, run the following command:
    bin/lineage-harvester -Dhttps.proxyHost=<hostname or IP address of the proxy> -Dhttps.proxyPort=<port number> -Dhttps.proxyUser=<username> -Dhttps.proxyPassword=<password> full-sync
    Example If you want to use a proxy with hostname proxy.example.com and port number 443, run the following command:
    bin/lineage-harvester -Dhttps.proxyHost=proxy.example.com -Dhttps.proxyPort=443 -Dhttps.proxyUser=myusername -Dhttps.proxyPassword=mypassword
  2. To exclude hosts that should be accessed without going through the proxy server, add the following parameter:
    -Dhttp.nonProxyHosts=<host to exclude>
  3. You can exclude multiple hosts by using the pipe character (|) to separate the hostnames or IP addresses to exclude. You can also use an asterisk (*) as a wildcard to match multiple hostnames or IP addresses.

    Example If you want to exclude the host localhost, the IP address 127.0.0.1, and all IP addresses starting with 192.168, run the following command:
    bin/lineage-harvester -Dhttps.proxyHost=proxy.example.com -Dhttps.proxyPort=443 "-Dhttp.nonProxyHosts=localhost|127.0.0.1|192.168*"
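
In most Unix shells, the pipe character (|) and the asterisk (*) have a special meaning, so the value of -Dhttp.nonProxyHosts generally needs to be quoted. The following is a minimal sketch, assuming a POSIX-compatible shell, of one way to build that option from a list of hosts. The function name build_non_proxy_option and the variable NON_PROXY_OPTION are illustrative and are not part of the lineage harvester. The sketch only prints the resulting command for review; it does not run the lineage harvester.

```shell
# Hypothetical helper: join the given hosts with "|" and print
# the resulting -Dhttp.nonProxyHosts option.
build_non_proxy_option() {
  joined=""
  for host in "$@"; do
    if [ -z "$joined" ]; then
      joined="$host"
    else
      joined="${joined}|${host}"
    fi
  done
  printf '%s\n' "-Dhttp.nonProxyHosts=${joined}"
}

# Quote the wildcard entry so that the shell does not expand the asterisk.
NON_PROXY_OPTION="$(build_non_proxy_option localhost 127.0.0.1 '192.168*')"

# Print the full command for review. The double quotes keep the option as one argument.
echo bin/lineage-harvester -Dhttps.proxyHost=proxy.example.com -Dhttps.proxyPort=443 "$NON_PROXY_OPTION"
```

When you run the real command, keep the double quotes around the option so that it reaches the lineage harvester as a single argument.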

In your configuration file, the value of the url or hostname property of the data source (which property applies depends on the data source) and the value of the -Dhttp.nonProxyHosts parameter, as described above, must be of the same type: both IP addresses or both host names. An error is generated if, for example, you have a host name in the hostname property and an IP address in the -Dhttp.nonProxyHosts parameter.
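
As a rough illustration of this rule, the following sketch compares two values before you run the lineage harvester. It assumes a POSIX-compatible shell with grep available. The function names value_kind and same_kind and the sample values are hypothetical, and the pattern is a simple approximation of an IPv4 address (digits and dots, optionally ending in an asterisk). It is not how the lineage harvester itself performs the check.

```shell
# Hypothetical helper: print "ip" for a value that looks like an IPv4 address
# or prefix (for example 127.0.0.1 or 192.168*), and "hostname" otherwise.
value_kind() {
  if printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9.]*\*?$'; then
    echo ip
  else
    echo hostname
  fi
}

# Hypothetical helper: succeed only if both values are of the same kind.
same_kind() {
  [ "$(value_kind "$1")" = "$(value_kind "$2")" ]
}

# Sample values: the first stands for the hostname property in the
# configuration file, the second for an entry in -Dhttp.nonProxyHosts.
if same_kind "db.example.com" "127.0.0.1"; then
  echo "consistent"
else
  echo "mismatch: use either host names or IP addresses in both places"
fi
```

With the sample values, the check reports a mismatch, because one value is a host name and the other is an IP address.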