Prepare Teradata SQL files for shared storage

To create technical lineage for JDBC data sources by using the Shared Storage Connection, you must provide SQL files that include your SQL queries. CollibraData Lineage processes the metadata based on your queries to create the technical lineage.

Prepare the SQL files and store them in your cloud-based storage system. The files must be in one of the following:

Steps

  1. Create your SQL files. Ensure that the following requirements are met for the SQL files:

    • The SQL files must be UTF-8 encoded.
    • The SQL files can't have white spaces in their names.
    • For better ingestion, include one SQL statement in one SQL file.
    • SQL files that contain Data Definition Language (DDL) statements must be processed before SQL files that contain Data Manipulation Language (DML) statements.
      For the data sources that are listed in Supported SQL statements , Collibra Data lineage automatically detects DDL statements, regardless of the SQL file names.
      For other JDBC data sources, Collibra Data Lineage processes SQL files in alphanumeric order. To ensure that DDL statements are processed first, name SQL files that contain DDL statements so they are before files that contain DML statements.
    • The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the technical lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Add a technical lineage capability to your Edge site and Automatic stitching for technical lineage.
    • For Collibra Data Lineage to correctly highlight the transformation logic in the Source code pane, we strongly recommend that your SQL files have Unix line endings. Non-Unix line endings, for example Carriage Return (CR) and Line Feed (LF) line breaks, do not influence the extracted lineage and can result in incorrect highlighting.

    For more information, go to Supported SQL syntax.

  2. Store the SQL files in the folder that you created when you created the Shared Storage connection.
  3. Store the SQL files in your cloud-based storage system. The files must be in one of the following:
    • An AWS S3 bucket.
    • An Azure Data Lake Storage container.
    • A Google Cloud Storage bucket.

Example 1 SQL statements do not include schema and database names

This example shows the SQL files that include the queries on the Persons and JobInformation tables and the JobTitleView view. The SQL statements don't contain the database and schema values, so Collibra Data Lineage uses the values of the Database and Schema fields in the technical lineage for SqlDirectory capability for stitching. The SQL files are named in a way that ensures the DDL statements are processed before the DML statement.

Example 2 SQL statements include schema and database names

This example shows SQL files that include the queries on the Persons and JobInformation tables and the JobTitleView view. The SQL statements contain the database and schema names for each table and view, and Collibra Data Lineage uses them for stitching. The SQL files are named in a way that ensures the DDL statements are processed before the DML statement.