Airflow: Set up OpenLineage integration and prepare files for Cloud Storage connections
Use this procedure to configure Airflow to emit OpenLineage messages and to save the resulting files to a location that Collibra can access.
- To install and configure the OpenLineage integration in Airflow, follow the Using OpenLineage integration guide in the Airflow documentation.
You can use the following configuration as an example:
    [openlineage]
    transport = '{"type":"http", "url": "http://HOST_OR_URL_WHERE_FLUENTD_IS:8888/openlineage"}'
    namespace = 'airflow'
- Copy the files in OpenLineage format to the relevant directory in your cloud-based storage system. The files must be in one of the following:
- An AWS S3 bucket.
- An Azure Data Lake Storage container.
- A Google Cloud Storage bucket.
Note: Each time you synchronize lineage, you must upload all source files that you want to include in the technical lineage graph.
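Because every synchronization needs the complete set of source files, it can help to build the upload list from the whole local directory each time rather than tracking only new files. The following is a minimal sketch of that idea, not part of the Collibra or Airflow tooling. The function names, the `.json` extension, the one-event-per-line layout, and the three keys used to recognize an OpenLineage event are all assumptions made for illustration; adjust them to match how your files are actually written.

```python
import json
from pathlib import Path

# Assumption: these top-level keys are enough to recognize an OpenLineage run event.
REQUIRED_KEYS = {"eventTime", "run", "job"}


def is_openlineage_file(path: Path) -> bool:
    """Return True if every non-empty line is a JSON object with REQUIRED_KEYS.

    Assumes one event per line; empty files are rejected.
    """
    lines = [ln for ln in path.read_text(encoding="utf-8").splitlines() if ln.strip()]
    if not lines:
        return False
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            return False
        if not isinstance(event, dict) or not REQUIRED_KEYS.issubset(event):
            return False
    return True


def build_upload_manifest(source_dir: str, prefix: str) -> list[tuple[str, str]]:
    """Pair every valid file under source_dir with a destination object key.

    The whole directory is scanned on each call, so no file is left out of a sync.
    """
    root = Path(source_dir)
    manifest = []
    for path in sorted(root.rglob("*.json")):
        if is_openlineage_file(path):
            relative = path.relative_to(root).as_posix()
            # Join prefix and relative path, skipping an empty prefix.
            key = "/".join(part for part in (prefix.strip("/"), relative) if part)
            manifest.append((str(path), key))
    return manifest
```

Each `(local_path, object_key)` pair in the returned list can then be passed to the upload call of your storage provider's SDK or command-line tool, which is left out here. Files that fail the check are skipped, so a truncated or malformed file does not reach the bucket or container.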
You can now: