Add an Edge capability to an Edge site

Important 

In Collibra 2024.05, we launched a new user interface (UI) for Collibra Data Intelligence Platform! You can learn more about this latest UI in the UI overview.


After you have created and installed an Edge site, you can add an Edge capability to perform specific tasks on a data source. For example, you can register a data source by using a JDBC connection that belongs to an Edge capability.

Prerequisites

Steps

Tip 

The information in this section varies depending on the capability template that you select.


  1. Open an Edge site.
    1. On the main toolbar, click the Products icon, and then click the Cogwheel icon (Settings).
      The Collibra settings page opens.
    2. In the tab pane, click Edge.
      The Sites tab opens and shows a table with an overview of the Edge sites.
    3. In the table, click the name of the Edge site whose status is Healthy.
      The Edge site page opens.
  2. In the Capabilities section, click Add capability.
    The Add capability page is shown.
  3. Select the Edge capability template that you want to use, for example Catalog Data Classification, Catalog JDBC ingestion, JDBC Profiling, Catalog JDBC Sampling, S3 synchronization, GCS synchronization, Databricks Unity Catalog synchronization, Technical Lineage for dbt, Collibra Protect for AWS Lake Formation, Collibra Protect for BigQuery, Collibra Protect for Databricks, or Collibra Protect for Snowflake.
    Note When you select a capability template, you may need to add required custom properties. For example, if you select the S3 synchronization capability template, you have to add credentials to configure the S3 connection.
  4. Enter the required information.
    Field | Description | Required

    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    No

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    S3 synchronization

    Yes

    S3 service account

    This section contains information about how to connect to Amazon S3.
    AWS Connection
    The AWS connection to be used.

    Yes

    IAM role
    The IAM role to be used by the AWS Glue crawlers.

    Yes

    Delete Glue database left after previous synchronization of the file system

    Select the checkbox if you want the capability to delete the Glue databases created by previous runs of the capability, before the capability starts the synchronization.
    If you deselect this checkbox, the Glue databases created by previous runs are not removed. This can be useful for troubleshooting.

    By default, this checkbox is selected.

    No

    Save input metadata

    Select the checkbox if you want to save the input metadata extracted from the data source in ZIP files. The files can be useful for troubleshooting.
    Select this option only on request of Collibra Support. The Collibra Support team can provide the location of the saved ZIP files after the S3 synchronization.

    By default, this checkbox is not selected.

    No

    Finalization Strategy

    Define what you want to do if an asset has been deleted from the S3 data source after an initial synchronization.
    The possible values are:

    • Change Status (default): If an asset has been deleted from the S3 data source after an initial synchronization, we update the status of the asset in Collibra to "Missing from source".
    • Remove Resources: If an asset has been deleted from the S3 data source after an initial synchronization, we remove the asset from Collibra.
    • Ignore: If an asset has been deleted from the S3 data source after an initial synchronization, we don't change anything for the asset in Collibra.

    Yes
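The three Finalization Strategy values can be pictured with a small Python model. This is only an illustrative sketch, not Collibra's implementation: the `finalize` function, the in-memory asset dictionary, and the "Implemented" starting status are hypothetical. Only the strategy names and the "Missing from source" status come from the field description above.

```python
from enum import Enum


class FinalizationStrategy(Enum):
    CHANGE_STATUS = "Change Status"        # the default strategy
    REMOVE_RESOURCES = "Remove Resources"
    IGNORE = "Ignore"


def finalize(assets, names_in_source, strategy=FinalizationStrategy.CHANGE_STATUS):
    """Return the assets as they would look after a synchronization.

    assets: dict mapping asset name -> status (hypothetical in-memory model).
    names_in_source: set of asset names still present in the S3 data source.
    """
    result = {}
    for name, status in assets.items():
        if name in names_in_source:
            result[name] = status                  # still in the source: untouched
        elif strategy is FinalizationStrategy.CHANGE_STATUS:
            result[name] = "Missing from source"   # keep the asset, flag it
        elif strategy is FinalizationStrategy.IGNORE:
            result[name] = status                  # keep the asset unchanged
        # REMOVE_RESOURCES: the asset is dropped, so nothing is copied over
    return result


assets = {"sales.csv": "Implemented", "old.csv": "Implemented"}
still_in_s3 = {"sales.csv"}

for strategy in FinalizationStrategy:
    print(strategy.value, "->", finalize(assets, still_in_s3, strategy))
```

Running the loop shows the same deleted file ending up flagged, removed, or untouched, depending on the strategy.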

    Logging parameter

    You can use this field to customize the debug logging.

    Important Only complete this field on request of or together with Collibra Support.

    No

    Custom parameter

    Use this field to define that you want to ingest File Group assets as File assets.

    • Name: file-group-as-file
    • Type: Text
    • Encryption or Value Type: Not encrypted (plain text)
    • Value: true

    No

    Glue database configuration

    Glue database configuration

    Text in JSON format to define the Glue database names, regions, and domain IDs that you want to integrate.

    Tip  Use this parameter if the current S3 synchronization crawler configuration doesn’t meet your needs. With this parameter, you can integrate an AWS Glue database for which you defined crawlers in AWS Glue itself. This allows you to use all crawler options from the AWS Glue Console. If you use this parameter, you don't need to create crawlers in Collibra.

    Important  If you use this parameter, crawlers that you create in Collibra are not taken into account during the S3 synchronization. However, you still need to create a dummy crawler in Collibra to start the synchronization. A dummy crawler is a crawler with an invalid include path, such as s3://dummy.
    In a future release, we'll remove the need for a dummy crawler.

    • The text must be in JSON format and can contain a block per database that you want to integrate.
      You can use any JSON validator to verify the format. Collibra is not responsible for the privacy, confidentiality, or protection of the data you submit to such JSON validators, and has no liability for such use.
    • In a block, you can specify the Glue database name, region, and domain ID that must be ingested. The format is:
      • "glueDbName": "the name of the AWS Glue database"
      • "glueDbRegion": "the region of the AWS Glue database"
      • "dgcDomainId": "the domain ID in Collibra where assets of the AWS Glue database must be added"
        If you don't add the domain ID, the assets are added in the same domain as the S3 File System asset.

    Example 
    [
    	{
    		"glueDbName": "integrations-auto-1",
    		"glueDbRegion": "eu-west-1",
    		"dgcDomainId": "a3fe0607-65af-43d6-bc2c-7c3adae6e162"
    	},
    	{
    		"glueDbName": "integrations-auto-2",
    		"glueDbRegion": "eu-west-1"
    	}
    ]

    In this example:

    • Assets from the AWS Glue database "integrations-auto-1" will be ingested into the domain with ID "a3fe0607-65af-43d6-bc2c-7c3adae6e162".
    • Assets from the AWS Glue database "integrations-auto-2" will be ingested into the same domain as the S3 File System asset.


    No
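If you prefer not to paste the Glue database configuration into an online JSON validator, you can check it locally. The following Python sketch uses only the standard library. The function name `check_glue_config` is hypothetical, and treating glueDbName and glueDbRegion as mandatory is an assumption based on the example above, not a documented rule.

```python
import json
import re

ALLOWED_KEYS = {"glueDbName", "glueDbRegion", "dgcDomainId"}
UUID_RE = re.compile(r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def check_glue_config(text):
    """Return a list of problems found in the configuration text (empty if none)."""
    try:
        blocks = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(blocks, list):
        return ["expected a JSON array with one block per database"]

    problems = []
    for index, block in enumerate(blocks):
        if not isinstance(block, dict):
            problems.append(f"block {index}: expected a JSON object")
            continue
        for key in sorted(set(block) - ALLOWED_KEYS):
            problems.append(f"block {index}: unexpected key {key!r}")
        # Assumption: name and region are both needed to locate the database.
        for key in ("glueDbName", "glueDbRegion"):
            if not block.get(key):
                problems.append(f"block {index}: {key} is missing or empty")
        # dgcDomainId is optional; if present, it should look like a UUID.
        domain_id = block.get("dgcDomainId")
        if domain_id is not None and not (
            isinstance(domain_id, str) and UUID_RE.fullmatch(domain_id)
        ):
            problems.append(f"block {index}: dgcDomainId {domain_id!r} is not a UUID")
    return problems


EXAMPLE = """
[
  {"glueDbName": "integrations-auto-1", "glueDbRegion": "eu-west-1",
   "dgcDomainId": "a3fe0607-65af-43d6-bc2c-7c3adae6e162"},
  {"glueDbName": "integrations-auto-2", "glueDbRegion": "eu-west-1"}
]
"""
print(check_glue_config(EXAMPLE))  # prints [] because the example is well-formed
```

The check only covers the shape of the text. It cannot tell whether the Glue database or the Collibra domain actually exists.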

    Advanced Configuration
    • Logging configuration
    • Memory
    • JVM arguments

    These configuration options help when investigating issues with the capability.

    Important Only complete the fields Save Input Metadata, Logging configuration, Memory (MiB), and JVM arguments on request of or together with Collibra Support.

    No

    Debug

    This setting is not valid for this integration. It should be set to false.

    An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Platform. By default, this option is set to false.

    Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Platform only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.

    No

    Log level

    An option to determine the verbosity of the log files. The default value is No logging.

    No

    Field | Description | Required

    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    No

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Catalog Data Classification

    Yes

    Connection

    This section contains information to connect to the data source.

    JDBC connection

    The connection to the data source.

    Yes

    General

    This section contains general information about logging.

    Debug

    An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Platform. By default, this option is set to false.

    Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Platform only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.

    No

    Log level

    An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.

    No

    Field | Description | Required

    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    No

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Databricks Unity Catalog synchronization

    Yes

    Databricks Connection

     
    Databricks Connection
    The Databricks connection to be used.

    Yes

    Configuration

    This section contains information on how to connect to Databricks Unity Catalog. 
    Save input metadata
    If you select this option, the metadata extracted from the data source is saved in a file that can be used for troubleshooting. Select this option only on request of Collibra Support.

    No

    Exclude Schemas (will be removed soon, use domain mapping instead)

    Comma-separated list of the schemas that you don't want to integrate.

    No

    (deprecated) Filters and Domain Mapping

    Important This field is deprecated in the latest UI. You can now define the mappings in the integration configuration.
    If you have existing mappings here, they will continue to work. However, we advise you to move them to the integration configuration.

    Text in JSON format to include or exclude databases and schemas, and to configure domain mappings.

    • The text must be in JSON format and can contain an include and an exclude block. You can use any JSON validator to verify the format. Collibra is not responsible for the privacy, confidentiality, or protection of the data you submit to such JSON validators, and has no liability for such use.
    • In the include block, you can specify the domain in which specific catalogs or schemas must be ingested. The format is: "Catalog/Database > schema": "domain ID". For example, "HR > address-schema": "30000000-0000-0000-0000-000000000000".
    • In the exclude block, you can specify the catalogs or schemas that you don't want to ingest. For example, "* > test".
    • The exclude block has priority over the include block.
    • If the include block is not present, we ingest all assets into the same domain as the System asset.
    • If there is no explicit domain mapping for a schema, we use the domain specified for the database.
    • You can use the keyword default as a domain ID. In that case, the catalog or schema will be ingested in the same domain as the System asset.
    • A match with a database has priority over a match with a schema.
    • The integration fails before the synchronization starts if one or more domain IDs specified in the include block don't exist.
    • The integration fails before the synchronization starts if a domain ID is left empty in the include block.
    • You can use the ? and * wildcards in the catalog and schema names. If a catalog or schema matches multiple lines, the most detailed match is taken into account.

    No
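Some of the rules for the Filters and Domain Mapping field can be checked before you save the text. The Python sketch below is an approximation under stated assumptions: it assumes the top-level keys are named "include" (an object that maps patterns to domain IDs) and "exclude" (an array of patterns), and it uses `fnmatch` wildcards as a stand-in for Collibra's own matching, which may differ in details. The "Finance" catalog and both function names are hypothetical.

```python
import json
import re
from fnmatch import fnmatchcase

UUID_RE = re.compile(r"[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}")


def check_include(mapping_text):
    """Report include entries that would make the integration fail early."""
    config = json.loads(mapping_text)
    problems = []
    for pattern, domain_id in config.get("include", {}).items():
        if not domain_id:
            problems.append(f"{pattern!r}: domain ID is empty")
        elif domain_id != "default" and not (
            isinstance(domain_id, str) and UUID_RE.fullmatch(domain_id)
        ):
            problems.append(f"{pattern!r}: {domain_id!r} is not a UUID or 'default'")
    return problems


def is_excluded(catalog, schema, mapping_text):
    """Approximate preview: does any exclude pattern match 'catalog > schema'?"""
    config = json.loads(mapping_text)
    target = f"{catalog} > {schema}"
    # fnmatchcase supports the ? and * wildcards (it also treats [ ] specially).
    return any(fnmatchcase(target, pattern) for pattern in config.get("exclude", []))


MAPPING = """
{
  "include": {
    "HR > address-schema": "30000000-0000-0000-0000-000000000000",
    "Finance > *": "default"
  },
  "exclude": ["* > test"]
}
"""
print(check_include(MAPPING))
print(is_excluded("HR", "test", MAPPING))
print(is_excluded("HR", "address-schema", MAPPING))
```

A local check like this cannot confirm that a domain ID exists in Collibra, so the integration can still fail on a well-formed but unknown ID.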

    (deprecated) Extensible Properties Mapping

    Via the Extensible Properties Mapping field, Databricks Unity Catalog allows you to add additional properties to Catalog, Schema, and Table objects.

    Important 
    • This field is deprecated in the latest UI. You can now define the mappings in the integration configuration. If you have existing mappings here, they will continue to work. However, we advise you to move them to the integration configuration.
    • If you use this feature, make sure to set up all required characteristic assignments for the asset types.

    Three possible JSON formats are available.

    • Version 0.1: This version allows you to ingest custom properties only. You can ingest the values from the Properties field from Catalog, Schema, and Table objects into specific attributes in Collibra assets. You do this by adding the mapping between the Properties fields for the objects in Databricks Unity Catalog and the Collibra attribute IDs to ingest the data in, using a JSON string.
      • The text must be in JSON format and can contain a Catalogs, Schemas, and Tables block. The Catalogs block refers to Database assets, the Schemas block to Schema assets, and the Tables block to Table assets.
      • In each block, you specify the property name and the attribute ID to which you want to map the value in the property. The format is: "[property name]": "[attribute resource ID]". For example, "Description from source system": "00000000-0000-0000-0001-000500000074".
      Example 
      {
        "catalogs": {
          "color": "00000000-0000-0000-0000-000000001234",
          "Description from source system": "00000000-0000-0000-0001-000500000074"
        },
        "schemas": {
          "File Location": "00000000-0000-0000-0001-000500000004"
        },
        "tables": {
          "delta.lastCommitTimestamp": "00000000-0000-0000-0000-000000003114"
        }
      }

      In this example:

      • In the Database assets that we create, we'll add the color value in attribute 00000000-0000-0000-0000-000000001234, and the Description from source system value in attribute 00000000-0000-0000-0001-000500000074.
      • In the Schema assets that we create, we'll add the File Location value in attribute 00000000-0000-0000-0001-000500000004.
      • In the Table assets that we create, we'll add the delta.lastCommitTimestamp value in attribute 00000000-0000-0000-0000-000000003114.
    • Version 0.2: This version allows you to ingest both default system properties and custom properties. You can ingest most values from the Details page from Catalog, Schema, and Table objects into specific attributes in Collibra assets. You do this by adding the mapping between the fields for the objects in Databricks Unity Catalog and the Collibra attribute IDs to ingest the data in, using a JSON string.
      • The text must be in JSON format.
      • A Version block referencing 0.2 must be added.
      • A Catalogs, Schemas, and Tables block can be added. The Catalogs block refers to Database assets, the Schemas block to Schema assets, and the Tables block to Table assets.
      • Inside a Catalogs, Schemas, or Tables block, you can add a systemAttributes and a customParameters block. systemAttributes refers to the default system properties. customParameters refers to the custom properties.
      • In each block, you specify the property name and the attribute ID to which you want to map the value in the property. The format is: "[property name]": "[attribute resource ID]". For example, "Description from source system": "00000000-0000-0000-0001-000500000074".
        The following system properties are supported:
        • Catalogs: "browse_only", "catalog_type", "connection_name", "created_at", "created_by", "isolation_mode", "metastore_id", "provider_name", "provisioning_info", "securable_kind", "securable_type", "share_name", "storage_location", "storage_root", "updated_at", and "updated_by".
        • Schemas: "catalog_type", "created_at", "created_by", "metastore_id", "securable_type", "securable_kind", "storage_location", "storage_root", "updated_at", and "updated_by".
        • Tables: "access_point", "created_at", "created_by", "data_access_configuration_id", "data_source_format", "deleted_at", "metastore_id", "securable_type", "securable_kind", "sql_path", "storage_credential_name", "storage_location", "table_type", "updated_at", "updated_by", and "view_definition".
          The Tables mapping applies to both tables and views.
      Example 
      {
        "version": 0.2,
        "catalogs": {
          "systemAttributes": {
            "metastore_id": "00000000-0000-0000-0000-000000004224"
          },
          "customParameters": {
            "color": "00000000-0000-0000-0000-000000001234",
            "Description from source system": "00000000-0000-0000-0001-000500000074"
          }
        },
        "schemas": {
          "customParameters": {
            "File Location": "00000000-0000-0000-0001-000500000004"
          }
        },
        "tables": {
          "systemAttributes": {
            "metastore_id": "00000000-0000-0000-0000-000000004224"
          },
          "customParameters": {
            "delta.lastCommitTimestamp": "00000000-0000-0000-0000-000000003114"
          }
        }
      }

      In this example:

      • In the Database assets that we create, we'll add the metastore_id value in attribute 00000000-0000-0000-0000-000000004224, the color value in attribute 00000000-0000-0000-0000-000000001234, and the Description from source system value in attribute 00000000-0000-0000-0001-000500000074.
      • In the Schema assets that we create, we'll add the File Location value in attribute 00000000-0000-0000-0001-000500000004.
      • In the Table and View assets that we create, we'll add the metastore_id value in attribute 00000000-0000-0000-0000-000000004224 and the delta.lastCommitTimestamp value in attribute 00000000-0000-0000-0000-000000003114.
    • Version 0.3: This version allows you to ingest both default system properties and custom properties, and to define separate mappings for tables and views. You can ingest most values from the Details page from Catalog, Schema, Table, and View objects into specific attributes in Collibra assets. You do this by adding the mapping between the fields for the objects in Databricks Unity Catalog and the Collibra attribute IDs to ingest the data in, using a JSON string.
      • The text must be in JSON format.
      • A Version block referencing 0.3 must be added.
      • A Catalogs, Schemas, Tables, and Views block can be added. The Catalogs block refers to Database assets, the Schemas block to Schema assets, the Tables block to Table assets, and the Views block to Database View assets.
      • Inside a Catalogs, Schemas, Tables, or Views block, you can add a systemAttributes and a customParameters block. systemAttributes refers to the default system properties. customParameters refers to the custom properties.
      • In each block, you specify the property name and the attribute ID to which you want to map the value in the property. The format is: "[property name]": "[attribute resource ID]". For example, "Description from source system": "00000000-0000-0000-0001-000500000074".
        The following system properties are supported:
        • Catalogs: "browse_only", "catalog_type", "connection_name", "created_at", "created_by", "isolation_mode", "metastore_id", "provider_name", "provisioning_info", "securable_kind", "securable_type", "share_name", "storage_location", "storage_root", "updated_at", and "updated_by".
        • Schemas: "catalog_type", "created_at", "created_by", "metastore_id", "securable_type", "securable_kind", "storage_location", "storage_root", "updated_at", and "updated_by".
        • Tables: "access_point", "created_at", "created_by", "data_access_configuration_id", "data_source_format", "deleted_at", "metastore_id", "securable_type", "securable_kind", "sql_path", "storage_credential_name", "storage_location", "table_type", "updated_at", "updated_by", and "view_definition".
        • Views: "access_point", "created_at", "created_by", "data_access_configuration_id", "data_source_format", "deleted_at", "metastore_id", "securable_type", "securable_kind", "sql_path", "storage_credential_name", "storage_location", "table_type", "updated_at", "updated_by", and "view_definition".
      Example 
      {
        "version": 0.3,
        "catalogs": {
          "systemAttributes": {
            "metastore_id": "00000000-0000-0000-0000-000000004224"
          },
          "customParameters": {
            "color": "00000000-0000-0000-0000-000000001234",
            "Description from source system": "00000000-0000-0000-0001-000500000074"
          }
        },
        "schemas": {
          "customParameters": {
            "File Location": "00000000-0000-0000-0001-000500000004"
          }
        },
        "tables": {
          "systemAttributes": {
            "metastore_id": "00000000-0000-0000-0000-000000004224"
          },
          "customParameters": {
            "delta.lastCommitTimestamp": "00000000-0000-0000-0000-000000003114"
          }
        },
        "views": {
          "systemAttributes": {
            "metastore_id": "00000000-0000-0000-0000-000000004224"
          },
          "customParameters": {
            "view.sqlConfig.spark.sql.session.timeZone": "018cedbf-37fc-7da3-9ea8-da2af754222e"
          }
        }
      }

      In this example:

      • In the Database assets that we create, we'll add the metastore_id value in attribute 00000000-0000-0000-0000-000000004224, the color value in attribute 00000000-0000-0000-0000-000000001234, and the Description from source system value in attribute 00000000-0000-0000-0001-000500000074.
      • In the Schema assets that we create, we'll add the File Location value in attribute 00000000-0000-0000-0001-000500000004.
      • In the Table assets that we create, we'll add the metastore_id value in attribute 00000000-0000-0000-0000-000000004224 and the delta.lastCommitTimestamp value in attribute 00000000-0000-0000-0000-000000003114.
      • In the Database View assets that we create, we'll add the metastore_id value in attribute 00000000-0000-0000-0000-000000004224 and the view.sqlConfig.spark.sql.session.timeZone value in attribute 018cedbf-37fc-7da3-9ea8-da2af754222e.

    No
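A version 0.2 or 0.3 Extensible Properties Mapping can be checked for structural mistakes before you paste it into the field. The Python sketch below is a minimal example, not an official validator: the function name `check_mapping` is hypothetical, and it checks only the block names described above and that every attribute ID is filled in. It does not check the supported system property names or whether the attribute IDs exist in Collibra.

```python
import json

# Top-level blocks allowed per version, as described for the field above.
BLOCKS_BY_VERSION = {
    0.2: {"catalogs", "schemas", "tables"},
    0.3: {"catalogs", "schemas", "tables", "views"},
}
SUB_BLOCKS = {"systemAttributes", "customParameters"}


def check_mapping(text):
    """Return a list of structural problems (empty if none were found)."""
    try:
        mapping = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(mapping, dict):
        return ["expected a JSON object"]
    version = mapping.get("version")
    if not isinstance(version, (int, float)) or version not in BLOCKS_BY_VERSION:
        return ["this sketch only checks mappings with version 0.2 or 0.3"]

    problems = []
    for block_name, block in mapping.items():
        if block_name == "version":
            continue
        if block_name not in BLOCKS_BY_VERSION[version]:
            problems.append(f"{block_name!r} is not a block in version {version}")
            continue
        for sub_name, properties in block.items():
            if sub_name not in SUB_BLOCKS:
                problems.append(f"{block_name}: unexpected block {sub_name!r}")
                continue
            for prop, attribute_id in properties.items():
                if not attribute_id:
                    problems.append(f"{block_name}.{sub_name}: {prop!r} has no attribute ID")
    return problems


GOOD = '{"version": 0.3, "views": {"systemAttributes": {"metastore_id": "00000000-0000-0000-0000-000000004224"}}}'
BAD = '{"version": 0.3, "tables": {} "views": {}}'  # comma missing between blocks

print(check_mapping(GOOD))
print(check_mapping(BAD))
```

A missing comma between two blocks, as in `BAD`, is reported as invalid JSON, which is a common mistake when a new block such as "views" is appended by hand.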

    Compute Resource HTTP Path

    The HTTP path of the compute resource in Databricks Unity Catalog that we can process to extract the source tags.

    You can find the HTTP path in the connection details of your cluster. For details, see Get connection details for a cluster in the Databricks documentation.

    No

    Advanced Configuration
    • Logging configuration
    • Memory
    • JVM arguments

    These configuration options help when investigating issues with the capability.

    Important Only complete the fields Save Input Metadata, Logging configuration, Memory (MiB), and JVM arguments on request of or together with Collibra Support.

    No

    Debug

    This setting is not valid for this integration. It should be set to false.

    No

    Log level

    This setting is not valid for this integration. It should be set to No logging.

    No

    Field | Description | Required

    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    No

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    ADLS synchronization

    Yes

    ADLS service account

    This section contains the information on how to connect to Azure Data Lake Storage.
    Azure Connection
    The ADLS connection to be used.

    Yes

    Synchronization Source

    Choose which Microsoft data sources you want to integrate from.

    Note This option is made available for private beta testing and should not yet be used outside the private beta tasks.

    No

    Microsoft Purview Account Name
    The name of your Microsoft Purview account.
    If you enter a Purview account name, the integration retrieves the metadata through Microsoft Purview.

    No

    Save Input Metadata
    If you select this option, the metadata extracted from the data source is saved in a file that can be used for troubleshooting. Select this option only on request of Collibra Support.

    No

    Max Schema Level

    For columns that have a structured technical data type, Array or Struct, you can register the structure of the data. This is supported for AVRO, CSV, JSON, ORC, PARQUET, PSV, SSV, TSV, TXT, and XML.

    In this field, enter the maximum level of the structure you want to see. For example, 3.

    Note If you include a high number of levels, this can have an impact on the integration performance.

    No

    Advanced Configuration
    • Logging configuration
    • Memory
    • JVM arguments

    These configuration options help when investigating issues with the capability.

    Important Only complete the fields Save Input Metadata, Logging configuration, Memory (MiB), and JVM arguments on request of or together with Collibra Support.

    No

    Debug

    This setting is not valid for this integration. It should be set to false.

    An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Platform. By default, this option is set to false.

    Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Platform only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.

    No

    Log level

    This setting is not valid for this integration. It should be set to No logging.

    An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.

    No

    Field | Description | Required

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    No

    Azure Connection
    The Azure connection to be used.

    Yes

    Subscription ID
    The ID of your Azure subscription.

    Yes

    Advanced Configuration

    These configuration options help when investigating issues with the capability.

    Important Only complete the fields Save Input Metadata, Logging configuration, Memory (MiB), and JVM arguments on request of or together with Collibra Support.

    No

    Debug

    This field is ignored when you integrate metadata from Azure ML.

    An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Platform. By default, this option is set to false.

    Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Platform only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.

    No

    Log level

    This field is ignored when you integrate metadata from Azure ML.

    An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.

    No

    Field | Description | Required

    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    No

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following capability template to ingest Collibra Data Quality & Observability user-defined rules, metrics, and dimensions into Collibra Data Catalog:

    DQ Connector

    Yes

    DQ

    This section contains information about the Collibra Data Quality & Observability connection.
    Base URL
    The URL of your Collibra Data Quality & Observability environment.

    Yes

    Username
    The Collibra Data Quality & Observability username for this connection.

    Yes

    Password
    The Collibra Data Quality & Observability password for this connection.

    Yes

    Encryption options

    Select the type of encryption to use.

    Default: To be encrypted by Edge management server.

    Issuer of the JWT
    If you have selected Encrypted with public key, enter your JWT issuer.

    No

    Collibra metadata model

    This section contains information about where to ingest Collibra Data Quality & Observability assets.
    DQ Rules domain id
    The UUID of the Rulebook Domain for the ingested Collibra Data Quality & Observability rules.

    Yes

    DQ Metrics domain id
    The UUID of the Rulebook Domain for the ingested Collibra Data Quality & Observability metrics.

    Yes

    DQ Dimensions domain id
    The UUID of the Governance Asset Domain for the ingested Collibra Data Quality & Observability dimensions.

    Yes

    Default DQ Dimension name

    The default Data Quality Dimension, for example, Accuracy, Completeness, or Consistency.

    Default: Completeness.

    Yes

    DQ Metric classified by DQ Dimension relation type id
    The UUID of the Data Quality Metric classified by / classifies Data Quality Dimension relation. If left unspecified, this relation will not be added.

    No

    Assets are imported in batches of this size

    The batch size of the ingestion.

    Default: 5000.

    Yes
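To get a feel for the batch size setting, the Python sketch below splits a list of assets into batches of at most 5000 items. It illustrates batching in general and makes no claim about how the DQ Connector implements it. The `batches` helper and the asset count of 12,345 are hypothetical.

```python
from itertools import islice


def batches(items, size=5000):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Hypothetical example: 12,345 assets with the default batch size of 5000.
sizes = [len(batch) for batch in batches(range(12345))]
print(sizes)  # prints [5000, 5000, 2345]
```

A smaller batch size means more, lighter import requests. A larger one means fewer, heavier requests.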

    Field | Description | Required

    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    No

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select your Edge capability template.

    Note When you select a capability template, you may need to add required custom properties. For example, if you select the S3 synchronization capability template, you have to add credentials to configure the S3 connection.

    Yes

    General

    This section contains general information about logging.

    Debug

    An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Platform. By default, this option is set to false.

    Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Platform only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.

    No

    Log level

    An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.

    No

    Field | Description | Required

    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    No

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    GCS synchronization

    Yes

    GCP service account

    This section contains information on how to connect to Google Cloud Storage.
    GCP Connection
    The GCP connection to be used.

    Yes

    Configuration

    This section contains information on the configuration of the crawlers.
    Maximum number of files per crawler
    The maximum number of files that can be registered per crawler. The default value is 1,000.

    Yes

    Save input metadata

    Select the checkbox if you want to save the input metadata extracted from the data source in ZIP files. The files can be useful for troubleshooting. Select this option only on request of Collibra Support. The Collibra Support team can provide the location of the saved ZIP files after the synchronization.

    This checkbox is not selected by default.

    No

    Integrate Schemas from Dataplex

    Select the checkbox if you want to integrate the schemas from Dataplex based on the crawler path that will be specified in the GCS integration configuration.
    If the checkbox is not selected, no Dataplex data will be ingested.

    This checkbox is selected by default.

    No

    Project IDs
    Add a comma-separated list of the Project IDs where Dataplex is enabled.
    The capability will search in these projects for schemas based on the crawler path that will be specified in the GCS integration configuration. If the Project IDs field is empty, the integration will search in the project included in the provided GCP Service Account Credentials JSON.

    No
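
The Project IDs fallback described above can be sketched as follows. This is only an illustration of the documented behavior, not Collibra's implementation: the function name and the sample project names are hypothetical, and the sketch assumes that the service account credentials JSON carries the usual `project_id` key of a GCP service account key file.

```python
import json


def projects_to_search(project_ids_field, credentials_json):
    """Return the projects that would be searched for Dataplex schemas.

    project_ids_field: the comma-separated value of the Project IDs field.
    credentials_json: the GCP service account credentials as a JSON string.
    """
    # Split the comma-separated list and drop empty entries and stray spaces.
    ids = [p.strip() for p in project_ids_field.split(",") if p.strip()]
    if ids:
        return ids
    # Empty field: fall back to the project named in the credentials JSON.
    return [json.loads(credentials_json)["project_id"]]


# Hypothetical sample values for illustration only.
creds = json.dumps({"type": "service_account", "project_id": "analytics-prod"})
print(projects_to_search("sales-dev, sales-prod", creds))
print(projects_to_search("", creds))
```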

    Advanced Configuration
    • Logging configuration
    • Memory
    • JVM arguments

    These configuration options help when investigating issues with the capability.

    Important Complete the fields Save input metadata, Logging configuration, Memory (MiB), and JVM arguments only at the request of, or together with, Collibra Support.

    No

    Debug

    This setting is not valid for this integration. It should be set to false.

    An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Platform. By default, this option is set to false.

    Note We recommend that you send Edge infrastructure log files to Collibra Data Intelligence Platform only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.

    No

    Log level

    This setting is not valid for this integration. It should be set to No logging.

    An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.

    No

    Field | Description | Required

    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    No

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Catalog JDBC ingestion

    Yes

    Connection

    This section contains information to connect to the data source.

    JDBC connection

    The connection to the data source.

    Yes

    JDBC data source type (Deprecated)

    Deprecated field. The field was used to indicate the type of the data source. You no longer need to change this field. The required value is automatically identified.

    Note The automatically identified value is not shown in this page.

    Yes

    Supports schemas

    A text field where you have to enter True to enable database registration of data sources that have no schema. If the data source has schemas, you can ignore this field.

    Tip If the data source does not have a schema, Data Catalog creates a Schema asset with the same name as the full name of the database.

    No

    Other Settings

    Others

    This section can contain additional capability properties.
    Click Add property (or Add Other Settings) to add a property.

    Note No validation is performed on the values you add.

    No

    General

    This section contains general information about logging.

    Debug

    An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Platform. By default, this option is set to false.

    Note We recommend that you send Edge infrastructure log files to Collibra Data Intelligence Platform only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.

    For more information, go to logging.

    No

    Log level

    An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.

    No

    Field | Description | Required

    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    No

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    JDBC Profiling

    Yes

    Connection

    This section contains information to connect to the data source.

    JDBC connection

    The connection to the data source.

    Yes

    Other Settings

    Others

    This section can contain additional capability properties.

    Warning Adding additional properties can have a significant impact on your Edge site. Only add or update them together with Collibra Support.

    Click Add property (or Add Other Settings) to add a property.

    Note No validation is performed on the values you add.

    No

    General

    This section contains general information about logging.

    Debug

    An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Platform. By default, this option is set to false.

    Note We recommend that you send Edge infrastructure log files to Collibra Data Intelligence Platform only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.

    For more information, go to logging.

    No

    Log level

    An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.

    No

    Field | Description | Required
    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    No

    AWS Connection

    The AWS connection to be used.

    Yes

    Advanced Configuration

    These configuration options help when investigating issues with the capability.

    Important Complete the fields Save Input Metadata, Logging configuration, Memory (MiB), and JVM arguments only at the request of, or together with, Collibra Support.

    No

    Debug

    This field is ignored when you integrate metadata from Amazon SageMaker.

    An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Platform. By default, this option is set to false.

    Note We recommend that you send Edge infrastructure log files to Collibra Data Intelligence Platform only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.

    No

    Log level

    This field is ignored when you integrate metadata from Amazon SageMaker.

    An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.

    No

    Field | Description | Required
    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    No

    AWS Connection

    The AWS connection to be used.

    Yes

    Advanced Configuration

    These configuration options help when investigating issues with the capability.

    Important Complete the fields Save Input Metadata, Logging configuration, Memory (MiB), and JVM arguments only at the request of, or together with, Collibra Support.

    No

    Debug

    This field is ignored when you integrate metadata from Amazon Bedrock.

    An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Platform. By default, this option is set to false.

    Note We recommend that you send Edge infrastructure log files to Collibra Data Intelligence Platform only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.

    No

    Log level

    This field is ignored when you integrate metadata from Amazon Bedrock.

    An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.

    No

    Field | Description | Required
    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    No

    SAP AI Core Connection

    The SAP AI Core connection to be used.

    Yes

    Advanced Configuration

    These configuration options help when investigating issues with the capability.

    Important Complete the fields Save Input Metadata, Logging configuration, Memory (MiB), and JVM arguments only at the request of, or together with, Collibra Support.

    No

    Debug

    This field is ignored when you integrate metadata from SAP AI Core.

    An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Platform. By default, this option is set to false.

    Note We recommend that you send Edge infrastructure log files to Collibra Data Intelligence Platform only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.

    No

    Log level

    This field is ignored when you integrate metadata from SAP AI Core.

    An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.

    No

    Field | Description | Required

    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    No

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Catalog JDBC Sampling

    Yes

    Connection

    This section contains information to connect to the data source.

    JDBC connection

    The connection to the data source.

    Yes

    General

    This section contains general information about logging.

    Debug

    An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Platform. By default, this option is set to false.

    Note We recommend that you send Edge infrastructure log files to Collibra Data Intelligence Platform only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.

    For more information, go to logging.

    No

    Log level

    An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.

    No

    Field | Description | Required?
    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Technical Lineage for Amazon Redshift

    Yes

    Main Properties

    This section contains the information for creating a technical lineage.

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    JDBC Connection

    The JDBC connection that you created for Catalog JDBC ingestion.

    Yes

    Collibra System Name

    The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.

    The value of this field must be the same as the full name of the System asset that you created when you registered the data source.

    Yes

    Database Name

    The name of your database, which is also the name of your Database asset in Data Catalog.

    Important This field is mandatory, but the value you specify is not taken into consideration. We will remove this field in a future Collibra version.

    Yes

    Database Name Override

    We strongly recommend that you not edit the full name of your System, Database and Schema assets in Data Catalog. Doing so can lead to errors during the technical lineage creation process. If stitching is missing specifically because you edited the full name of your Database asset, you can use this field to specify the current name of your Database asset in Data Catalog.

    No

    Queries

    The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.

    Example To exclude schemas that don't contain customer data, enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema');. This filter excludes the pg_catalog and information_schema schemas. To exclude other schemas, add them to the list, for example: where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');.

    Note 
    • If you change queries, you can only use supported SQL syntax.
    • Collibra Support does not provide support for customized SQL files.
    Query | Description
    Columns | This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
    Views | This query retrieves the view definitions.

    Yes
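
If you maintain the schema exclusion list outside of the capability, the filter from the Views query example above can be generated rather than typed by hand. The following is a minimal sketch: the function name is hypothetical, and it only produces the `where` clause in the same form as the example, which you would then paste into the query.

```python
def schema_exclusion_filter(extra_schemas=()):
    """Build the Views query filter that excludes system schemas.

    pg_catalog and information_schema are always excluded, as in the
    documented example; extra_schemas are appended to that list.
    """
    excluded = ["pg_catalog", "information_schema", *extra_schemas]
    # Quote each name as a SQL string literal, doubling embedded quotes.
    quoted = ", ".join("'" + name.replace("'", "''") + "'" for name in excluded)
    return f"where v.table_schema not in ({quoted});"


print(schema_exclusion_filter())
print(schema_exclusion_filter(["another_schema"]))
```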

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Advanced Properties

    This section contains the advanced properties for creating a technical lineage.

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

    No

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    The option determines whether to include or remove the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes

    Logging

    This section contains general information about logging.

    Debug

    An option to enable logging of a JDBC job. If you enable logging, you can download the output file of the JDBC job in the Edge Jobs dashboard (beta). The output file contains the logs of the JDBC driver. For more information about downloading the output file, go to Download job output files.

    Select one of the following values:

    True
    Enables logging of the JDBC job.
    False
    Disables logging of the JDBC job. This is the default value.

    No

    Field | Description | Required?

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    JDBC Connection

    The JDBC connection that you created for Catalog JDBC ingestion.

    Yes

    Collibra System Name

    The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.

    The value of this field must be the same as the full name of the System asset that you created when you registered the data source.

    Yes

    Database Name

    The name of your database, which is also the name of your Database asset in Data Catalog.

    Important This field is mandatory, but the value you specify is not taken into consideration. We will remove this field in a future Collibra version.

    Yes

    Database Name Override

    We strongly recommend that you not edit the full name of your System, Database and Schema assets in Data Catalog. Doing so can lead to errors during the technical lineage creation process. If stitching is missing specifically because you edited the full name of your Database asset, you can use this field to specify the current name of your Database asset in Data Catalog.

    No

    Queries

    The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.

    Example To exclude schemas that don't contain customer data, enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema');. This filter excludes the pg_catalog and information_schema schemas. To exclude other schemas, add them to the list, for example: where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');.

    Note 
    • If you change queries, you can only use supported SQL syntax.
    • Collibra Support does not provide support for customized SQL files.
    Query | Description
    Columns | This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
    Views | This query retrieves the view definitions.

    Yes

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

    No

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    The option determines whether to include or remove the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes

    Debug

    An option to enable logging of a JDBC job. If you enable logging, you can download the output file of the JDBC job in the Edge Jobs dashboard (beta). The output file contains the logs of the JDBC driver. For more information about downloading the output file, go to Download job output files.

    Select one of the following values:

    True
    Enables logging of the JDBC job.
    False
    Disables logging of the JDBC job. This is the default value.

    No

    Log level

    An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.

    No

    Field | Description | Required?
    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Technical Lineage for SqlDirectory

    Yes

    Main Properties

    This section contains the information for creating a technical lineage.

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    Shared Storage Connection

    The Shared Storage connection that you created.

    No

    Mask

    The pattern of the file names in the directory. By default, the value is *.

    Yes
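
To get a feel for how a file name pattern narrows down the files in a directory, consider the sketch below. It assumes shell-style wildcard semantics for the Mask field, which this page does not confirm; the function name and file names are hypothetical, and Python's `fnmatch` is used only as a stand-in for whatever matching the capability performs.

```python
from fnmatch import fnmatch


def select_files(file_names, mask="*"):
    """Return the file names that match the mask.

    The default mask "*" mirrors the documented default and matches
    every file name.
    """
    return [name for name in file_names if fnmatch(name, mask)]


# Hypothetical directory listing for illustration only.
files = ["orders.sql", "customers.sql", "readme.txt"]
print(select_files(files))
print(select_files(files, "*.sql"))
```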

    Dialect

    The dialect of the database.

    Yes

    Collibra System Name

    The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.

    The value of this field must be the same as the full name of the System asset that you created when you registered the data source.

    Yes

    Database

    The name of your database, which is also the name of your Database asset in Data Catalog.

    Note The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the technical lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.

    Yes
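
The precedence rule in the note above can be pictured with a small sketch. It is an illustration only, not how Collibra Data Lineage parses SQL: the function name is hypothetical, table references are assumed to be plain dot-separated names without quoted identifiers, and the handling of a reference that names a schema but no database is an assumption.

```python
def resolve_for_stitching(table_ref, default_database, default_schema):
    """Resolve the database, schema, and table used for stitching.

    Names found in the SQL table reference win; the capability's
    Database and Schema field values are used only as fallbacks.
    """
    parts = table_ref.split(".")
    table = parts[-1]
    # Use the schema from the SQL statement if present, else the field value.
    schema = parts[-2] if len(parts) >= 2 else default_schema
    # Use the database from the SQL statement if present, else the field value.
    database = parts[-3] if len(parts) >= 3 else default_database
    return database, schema, table


# Fully qualified reference: the capability defaults are ignored.
print(resolve_for_stitching("sales.public.orders", "dw", "dbo"))
# Unqualified reference: the capability defaults are used.
print(resolve_for_stitching("orders", "dw", "dbo"))
```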

    Schema

    The name of the default schema, if not specified in the data source itself. This corresponds to the name of your Schema asset.

    Note The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the technical lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.

    Yes

    Database-System mapping

    This optional field allows you to map databases to their rightful systems, to obtain stitching. This resolves missing stitching, which occurs when Collibra Data Lineage associates multiple databases with the default system name that you provide in the Collibra System Name field.

    No

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Advanced Properties

    This section contains the advanced properties for creating a technical lineage.

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

     

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    The option determines whether to include or remove the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes

    Field | Description | Required?

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Technical Lineage for SqlDirectory

    Yes

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    Shared Storage Connection

    The Shared Storage connection that you created.

    No

    Mask

    The pattern of the file names in the directory. By default, the value is *.

    Yes

    Dialect

    The dialect of the database.

    Yes

    Collibra System Name

    The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.

    The value of this field must be the same as the full name of the System asset that you created when you registered the data source.

    Yes

    Database

    The name of your database, which is also the name of your Database asset in Data Catalog.

    Note The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the technical lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.

    Yes

    Schema

    The name of the default schema, if not specified in the data source itself. This corresponds to the name of your Schema asset.

    Note The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the technical lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.

    Yes

    Database-System mapping

    This optional field allows you to map databases to their rightful systems, to obtain stitching. This resolves missing stitching, which occurs when Collibra Data Lineage associates multiple databases with the default system name that you provide in the Collibra System Name field.

    No

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

     

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    The option determines whether to include or remove the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes

    Field | Description | Required?

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    No

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    Authentication Type

    The authentication details for signing in to Azure Data Factory. You can select one of the following values:

    Service Principal
    When you select this authentication type, ensure that you entered the application secret for the Service Principal in the Service Principal Secret field when you created the Azure connection.
    Resource Owner Password Credentials
    When you select this authentication type, ensure that you specify the username field in this capability and also entered the password in the Service Principal Secret field when you created the Azure connection.

    Yes

    ADF Connection

    The Azure connection that you created.

    Yes

    Username

    The email address of your Azure Active Directory user.

    This field applies only when you selected Resource Owner Password Credentials for the Authentication Type field.
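
    The dependency between the Authentication Type and Username fields can be sketched as a small check. This is an illustration only: the function name check_adf_auth and the two constants are hypothetical and are not part of any Collibra API.

```python
SERVICE_PRINCIPAL = "Service Principal"
ROPC = "Resource Owner Password Credentials"


def check_adf_auth(auth_type, username=None):
    """Return a list of problems with the settings (empty if none found)."""
    problems = []
    if auth_type not in (SERVICE_PRINCIPAL, ROPC):
        problems.append(f"unknown authentication type: {auth_type!r}")
    elif auth_type == ROPC and not username:
        # Username applies only to Resource Owner Password Credentials.
        problems.append("Username is needed for Resource Owner Password Credentials")
    elif auth_type == SERVICE_PRINCIPAL and username:
        problems.append("Username does not apply to Service Principal")
    return problems


print(check_adf_auth(ROPC))
print(check_adf_auth(ROPC, "user@example.com"))
```

    In both cases the secret itself stays in the Service Principal Secret field of the Azure connection, so the sketch does not handle it.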

    No

    Resource Group Name

    The name of the resource group that the data factory belongs to.

    Yes

    Subscription ID

    The subscription ID of the resource group.

    Yes

    Factories

    The Azure Data Factory factories that Collibra Data Lineage collects and processes. Specify this property with an array of Azure Data Factory factory names. This property is optional.

    The following rules apply when you specify this property:

    • Enter the factory names in square brackets ([ ]), enclose each factory name in double quotes (" "), and separate them by a comma, for example, ["MyFirstFactory", "MySecondFactory"].
    • The factory name is not case-sensitive. For example, the MyFactory and myfactory factories are considered the same by Azure Data Factory and Collibra Data Lineage.
    • If you do not specify any factory name, Collibra Data Lineage collects and processes all factories that have datasets and pipelines in them.
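
    If you generate the Factories value from a list of names, a short script can produce the required format. The following is a minimal sketch, assuming that you paste the printed result into the field yourself. The function name format_factories is hypothetical.

```python
import json


def format_factories(names):
    """Return a Factories field value for the given factory names.

    Names are de-duplicated case-insensitively, because MyFactory and
    myfactory refer to the same factory. The first spelling seen is kept.
    """
    seen = set()
    unique = []
    for name in names:
        key = name.casefold()
        if key not in seen:
            seen.add(key)
            unique.append(name)
    # json.dumps writes square brackets, double quotes, and comma separators.
    return json.dumps(unique)


factories_value = format_factories(
    ["MyFirstFactory", "MySecondFactory", "myfirstfactory"]
)
print(factories_value)
```

    The third name is dropped because it differs from the first only by case.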

    No

    Source Configuration

    The source configuration for database mapping, system mapping, schema mapping, and filtering. Specify the following properties in JSON format and enter the content in this field.

    If you previously created a technical lineage for this data source with connection definitions by using the lineage harvester, you can enter the content from the <sourceId>.conf file in this field.
    Example 

    No

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

     

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes

    Debug

    This setting is not valid for this integration. It should be set to false.

    No

    Log level

    This setting is not valid for this integration. It should be set to No logging.

    No

    Field | Description | Required?
    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Technical Lineage for ADF

    Yes

    Main Properties

    This section contains the information for creating a technical lineage.

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    Authentication Type

    The authentication details for signing in to Azure Data Factory. You can select one of the following values:

    Service Principal
    When you select this authentication type, ensure that you entered the application secret for the Service Principal in the Service Principal Secret field when you created the Azure connection.
    Resource Owner Password Credentials
    When you select this authentication type, ensure that you specify the username field in this capability and also entered the password in the Service Principal Secret field when you created the Azure connection.

    Yes

    ADF Connection

    The Azure connection that you created.

    Yes

    Username

    The email address of your Azure Active Directory user.

    This field applies only when you selected Resource Owner Password Credentials for the Authentication Type field.

    No

    Resource Group Name

    The name of the resource group that the data factory belongs to.

    Yes

    Subscription ID

    The subscription ID of the resource group.

    Yes

    Factories

    The Azure Data Factory factories that Collibra Data Lineage collects and processes. Specify this property with an array of Azure Data Factory factory names. This property is optional.

    The following rules apply when you specify this property:

    • Enter the factory names in square brackets ([ ]), enclose each factory name in double quotes (" "), and separate them by a comma, for example, ["MyFirstFactory", "MySecondFactory"].
    • The factory name is not case-sensitive. For example, the MyFactory and myfactory factories are considered the same by Azure Data Factory and Collibra Data Lineage.
    • If you do not specify any factory name, Collibra Data Lineage collects and processes all factories that have datasets and pipelines in them.
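
    Before you paste a Factories value into the field, you can check it against the rules above. The following is a minimal sketch that treats the value as a JSON array of strings, which matches the bracket and double-quote rules. The function name parse_factories is hypothetical, and the check is not how Collibra Data Lineage itself validates the field.

```python
import json


def parse_factories(value):
    """Parse a Factories field value and return the list of factory names.

    An empty or blank value means "all factories" and returns an empty list.
    Raises ValueError if the value is not an array of double-quoted names.
    """
    if not value.strip():
        return []
    try:
        parsed = json.loads(value)
    except json.JSONDecodeError as err:
        raise ValueError(f"not a valid array: {err}") from err
    if not isinstance(parsed, list) or not all(isinstance(n, str) for n in parsed):
        raise ValueError("expected an array of double-quoted factory names")
    return parsed


print(parse_factories('["MyFirstFactory", "MySecondFactory"]'))
```

    A value with single quotes, such as ['MyFirstFactory'], raises ValueError in this sketch.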

    No

    Source Configuration

    The source configuration for database mapping, system mapping, schema mapping, and filtering. Specify the following properties in JSON format and enter the content in this field.

    If you previously created a technical lineage for this data source with connection definitions by using the lineage harvester, you can enter the content from the connection_definitions.conf file in this field.
    Example 

    No

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Advanced Properties

    This section contains the advanced properties for creating a technical lineage.

     
    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

    No

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes

    Logging

    This section contains the properties for debug logging. This setting is not valid for this integration.

     

    Debug

    This setting is not valid for this integration. It should be set to false.

    No

    Field | Description | Required?

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    No

    GCP Connection

    The GCP connection that you created.

    Yes

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    No

    Save Input Metadata

    This option determines whether you want to save the input metadata that is extracted from the data source in a file. The file can be useful for troubleshooting. Select this option only on request of Collibra Support.

    No

    Logging configuration, Memory (MiB), and JVM arguments

    These fields contain configuration options that can help when investigating issues with the capability.

    Important Only complete these fields on request of or together with Collibra Support.

    No

    Debug

    This setting is not valid for this integration. It should be set to false.

    No

    Log level

    This setting is not valid for this integration. It should be set to No logging.

    No

    Field | Description | Required?
    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Technical Lineage for Azure

    Yes

    Main Properties

    This section contains the information for creating a technical lineage.

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    JDBC Connection

    The JDBC connection that you created for Catalog JDBC ingestion.

    Yes

    Collibra System Name

    The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.

    The value of this field must be the same as the full name of the System asset that you created when you registered the data source.

    Yes

    Database Name

    The name of your database, which is also the name of your Database asset in Data Catalog.

    Yes

    Queries

    The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.

    Example Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema');. This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. To exclude other schemas, add them to the list, for example: where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');.
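
    If you maintain the list of excluded schemas elsewhere, you can generate the filter text instead of typing it. The following is a minimal sketch that reproduces the filter from the example above. The function name schema_exclusion_filter is hypothetical, and the column alias v.table_schema is taken from the example and may differ in the queries for your data source.

```python
def schema_exclusion_filter(schemas, column="v.table_schema"):
    """Build a WHERE filter that excludes the given schemas."""
    if not schemas:
        raise ValueError("at least one schema name is required")
    # Double any single quote so each name stays a valid SQL string literal.
    quoted = ", ".join("'" + s.replace("'", "''") + "'" for s in schemas)
    return f"where {column} not in ({quoted});"


views_filter = schema_exclusion_filter(
    ["pg_catalog", "information_schema", "another_schema"]
)
print(views_filter)
```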

    Note 
    • If you change queries, you can only use supported SQL syntax.
    • Collibra Support does not provide support for customized SQL files.
    Query | Description
    Columns | This query retrieves the columns, tables, schemas, databases or projects fields in the form: database or project > schema > table > column.
    Synonyms | This query retrieves the alternative names for the database objects.
    Views | This query retrieves the view definitions.
    Other Queries | This query retrieves other data that technical lineage needs, for example stored procedures, functions, and packages.

    Yes

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Advanced Properties

    This section contains the advanced properties for creating a technical lineage.

     

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

    No

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes

    Logging

    This section contains general information about logging.

    Debug

    An option to enable logging of a JDBC job. If you enable logging, you can download the output file of the JDBC job in the Edge Jobs dashboard (beta). The output file contains the logs of the JDBC driver. For more information about downloading the output file, go to Download job output files.

    Select one of the following values:

    True
    Enables logging of the JDBC job.
    False
    Disables logging of the JDBC job. This is the default value.

    No

    Field | Description | Required?

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    JDBC Connection

    The JDBC connection that you created for Catalog JDBC ingestion.

    Yes

    Collibra System Name

    The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.

    The value of this field must be the same as the full name of the System asset that you created when you registered the data source.

    Yes

    Database Name

    The name of your database, which is also the name of your Database asset in Data Catalog.

    Yes

    Queries

    The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.

    Example Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema');. This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. To exclude other schemas, add them to the list, for example: where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');.

    Note 
    • If you change queries, you can only use supported SQL syntax.
    • Collibra Support does not provide support for customized SQL files.
    Query | Description
    Columns | This query retrieves the columns, tables, schemas, databases or projects fields in the form: database or project > schema > table > column.
    Synonyms | This query retrieves the alternative names for the database objects.
    Views | This query retrieves the view definitions.
    Other Queries | This query retrieves other data that technical lineage needs, for example stored procedures, functions, and packages.

    Yes

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

    No

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes

    Debug

    An option to enable logging of a JDBC job. If you enable logging, you can download the output file of the JDBC job in the Edge Jobs dashboard (beta). The output file contains the logs of the JDBC driver. For more information about downloading the output file, go to Download job output files.

    Select one of the following values:

    True
    Enables logging of the JDBC job.
    False
    Disables logging of the JDBC job. This is the default value.

    No

    Log level

    An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.

    No

    Field | Description | Required?
    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Technical Lineage for SqlDirectory

    Yes

    Main Properties

    This section contains the information for creating a technical lineage.

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    Shared Storage Connection

    The Shared Storage connection that you created.

    No

    Mask

    The pattern of the file names in the directory. By default, the value is *.
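
    To preview which files a mask would select, you can test it locally. The following is a minimal sketch that assumes the mask behaves like a shell-style wildcard pattern, which the default value * suggests but this page does not confirm. The function name files_matching_mask and the file names are hypothetical.

```python
from fnmatch import fnmatchcase


def files_matching_mask(file_names, mask="*"):
    """Return the file names that match a shell-style wildcard mask."""
    return [name for name in file_names if fnmatchcase(name, mask)]


files = ["orders.sql", "customers.sql", "readme.txt"]
print(files_matching_mask(files))           # default mask selects every file
print(files_matching_mask(files, "*.sql"))  # only the files ending in .sql
```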

    Yes

    Dialect

    The dialect of the database.

    Yes

    Collibra System Name

    The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.

    The value of this field must be the same as the full name of the System asset that you created when you registered the data source.

    Yes

    Database

    The name of your database, which is also the name of your Database asset in Data Catalog.

    Note The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the technical lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.

    Yes

    Schema

    The name of the default schema, if not specified in the data source itself. This corresponds to the name of your Schema asset.

    Note The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the technical lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.
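
    The precedence rule in this note can be pictured with a small sketch. It is an illustration under simplifying assumptions, not the logic that Collibra Data Lineage runs: table references are plain dot-separated names without quoted identifiers, and a reference that names only a schema takes the database from the Database field. The function name resolve_table is hypothetical.

```python
def resolve_table(reference, default_database, default_schema):
    """Return (database, schema, table) for a table reference in a SQL statement.

    Names written in the reference win. The Database and Schema field values
    are used only for the parts that the reference leaves out.
    """
    parts = reference.split(".")
    if len(parts) > 3 or any(not p for p in parts):
        raise ValueError(f"unsupported table reference: {reference!r}")
    table = parts[-1]
    schema = parts[-2] if len(parts) >= 2 else default_schema
    database = parts[-3] if len(parts) == 3 else default_database
    return database, schema, table


# Fully qualified: the field values are not used.
print(resolve_table("sales.public.orders", "dwh", "dbo"))
# Unqualified: both field values are used.
print(resolve_table("orders", "dwh", "dbo"))
```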

    Yes

    Database-System mapping

    This optional field allows you to map databases to their rightful systems, to obtain stitching. This resolves missing stitching, which occurs when Collibra Data Lineage associates multiple databases with the default system name that you provide in the Collibra System Name field.

    No

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Advanced Properties

    This section contains the advanced properties for creating a technical lineage.

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

     

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes

    Field | Description | Required?

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Technical Lineage for SqlDirectory

    Yes

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    Shared Storage Connection

    The Shared Storage connection that you created.

    No

    Mask

    The pattern of the file names in the directory. By default, the value is *.

    Yes

    Dialect

    The dialect of the database.

    Yes

    Collibra System Name

    The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.

    The value of this field must be the same as the full name of the System asset that you created when you registered the data source.

    Yes

    Database

    The name of your database, which is also the name of your Database asset in Data Catalog.

    Note The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the technical lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.

    Yes

    Schema

    The name of the default schema, if not specified in the data source itself. This corresponds to the name of your Schema asset.

    Note The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the technical lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.

    Yes

    Database-System mapping

    This optional field allows you to map databases to their rightful systems, to obtain stitching. This resolves missing stitching, which occurs when Collibra Data Lineage associates multiple databases with the default system name that you provide in the Collibra System Name field.

    No

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

     

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes
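The effect of the Database-System mapping field in the table above can be pictured with a minimal Python sketch. This is not Collibra's implementation, and it does not show the format that the field expects. The function name `resolve_system`, the sample database and system names, and the fallback to the default system for unmapped databases are assumptions made for illustration.

```python
from typing import Mapping


def resolve_system(database: str, mapping: Mapping[str, str], default_system: str) -> str:
    """Return the system a database belongs to.

    Assumption: a database without a mapping entry falls back to the
    default system name (the Collibra System Name field).
    """
    return mapping.get(database, default_system)


# Hypothetical values, for illustration only.
default_system = "warehouse-system"
db_to_system = {"sales_db": "sales-system", "hr_db": "hr-system"}

for db in ("sales_db", "hr_db", "finance_db"):
    print(db, "->", resolve_system(db, db_to_system, default_system))
```

Without a mapping, all three databases in this sketch would be associated with the default system, which is the situation that leads to missing stitching.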

    Field | Description | Required?
    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Technical Lineage for Azure

    Yes

    Main Properties

    This section contains the information for creating a technical lineage.

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    JDBC Connection

    The JDBC connection that you created for Catalog JDBC ingestion.

    Yes

    Collibra System Name

    The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.

    The value of this field must be the same as the full name of the System asset that you created when you registered the data source.

    Yes

    Database Name

    The name of your database, which is also the name of your Database asset in Data Catalog.

    Important This field is mandatory, but the value you specify is not taken into consideration. We will remove this field in a future Collibra version.

    Yes

    Database Name Override

    We strongly recommend that you not edit the full name of your System, Database and Schema assets in Data Catalog. Doing so can lead to errors during the technical lineage creation process. If stitching is missing specifically because you edited the full name of your Database asset, you can use this field to specify the current name of your Database asset in Data Catalog.

    No

    Queries

    The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.

    Example Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema');. This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. To exclude other schemas, add them to the list, for example: where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');.

    Note 
    • If you change queries, you can only use supported SQL syntax.
    • Collibra Support does not provide support for customized SQL files.
    Query | Description

    Columns

    This query retrieves the columns, tables, schemas, databases or projects fields in the form: database or project > schema > table > column.

    Synonyms

    This query retrieves the alternative names for the database objects.

    Views

    This query retrieves the view definitions.

    Other Queries

    This query retrieves other data that technical lineage needs, for example stored procedures, functions, and packages.

    Yes

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Advanced Properties

    This section contains the advanced properties for creating a technical lineage.

     

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

    No

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes

    Logging

    This section contains general information about logging.

    Debug

    An option to enable logging of a JDBC job. If you enable logging, you can download the output file of the JDBC job in the Edge Jobs dashboard (beta). The output file contains the logs of the JDBC driver. For more information about downloading the output file, go to Download job output files.

    Select one of the following values:

    True
    Enables logging of the JDBC job.
    False
    Disables logging of the JDBC job. This is the default value.

    No
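If you maintain the list of excluded schemas outside Collibra, the filter from the Queries example in the table above can be generated rather than typed by hand. The following Python sketch only builds the text of the filter. The helper name `schema_exclusion_filter` is hypothetical, and the default column `v.table_schema` is taken from the example on this page; your data source's Views query may use a different alias.

```python
def schema_exclusion_filter(excluded_schemas, column="v.table_schema"):
    """Build a 'where ... not in (...)' filter that excludes the given schemas."""
    if not excluded_schemas:
        raise ValueError("provide at least one schema to exclude")
    # Double any single quote so that a schema name cannot break the SQL literal.
    quoted = ", ".join("'" + name.replace("'", "''") + "'" for name in excluded_schemas)
    return f"where {column} not in ({quoted})"


print(schema_exclusion_filter(["pg_catalog", "information_schema"]))
print(schema_exclusion_filter(["pg_catalog", "information_schema", "another_schema"]))
```

Paste the generated text into the Views query yourself. Customized queries must still use supported SQL syntax.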

    Field | Description | Required?

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    JDBC Connection

    The JDBC connection that you created for Catalog JDBC ingestion.

    Yes

    Collibra System Name

    The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.

    The value of this field must be the same as the full name of the System asset that you created when you registered the data source.

    Yes

    Database Name

    The name of your database, which is also the name of your Database asset in Data Catalog.

    Important This field is mandatory, but the value you specify is not taken into consideration. We will remove this field in a future Collibra version.

    Yes

    Database Name Override

    We strongly recommend that you not edit the full name of your System, Database and Schema assets in Data Catalog. Doing so can lead to errors during the technical lineage creation process. If stitching is missing specifically because you edited the full name of your Database asset, you can use this field to specify the current name of your Database asset in Data Catalog.

    No

    Queries

    The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.

    Example Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema');. This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. To exclude other schemas, add them to the list, for example: where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');.

    Note 
    • If you change queries, you can only use supported SQL syntax.
    • Collibra Support does not provide support for customized SQL files.
    Query | Description

    Columns

    This query retrieves the columns, tables, schemas, databases or projects fields in the form: database or project > schema > table > column.

    Synonyms

    This query retrieves the alternative names for the database objects.

    Views

    This query retrieves the view definitions.

    Other Queries

    This query retrieves other data that technical lineage needs, for example stored procedures, functions, and packages.

    Yes

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

    No

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes

    Debug

    An option to enable logging of a JDBC job. If you enable logging, you can download the output file of the JDBC job in the Edge Jobs dashboard (beta). The output file contains the logs of the JDBC driver. For more information about downloading the output file, go to Download job output files.

    Select one of the following values:

    True
    Enables logging of the JDBC job.
    False
    Disables logging of the JDBC job. This is the default value.

    No

    Log level

    An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.

    No

    Field | Description | Required?
    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Technical Lineage for SqlDirectory

    Yes

    Main Properties

    This section contains the information for creating a technical lineage.

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    Shared Storage Connection

    The Shared Storage connection that you created.

    No

    Mask

    The pattern of the file names in the directory. By default, the value is *.

    Yes

    Dialect

    The dialect of the database.

    Yes

    Collibra System Name

    The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.

    The value of this field must be the same as the full name of the System asset that you created when you registered the data source.

    Yes

    Database

    The name of your database, which is also the name of your Database asset in Data Catalog.

    Note The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the Technical Lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.

    Yes

    Schema

    The name of the default schema, if not specified in the data source itself. This corresponds to the name of your Schema asset.

    Note The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the Technical Lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.

    Yes

    Database-System mapping

    This optional field allows you to map databases to the systems they belong to, so that stitching can succeed. This resolves missing stitching, which occurs when Collibra Data Lineage associates multiple databases with the default system name that you provide in the Collibra System Name field.

    No

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Advanced Properties

    This section contains the advanced properties for creating a technical lineage.

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

    No

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes
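To preview which files a Mask value from the table above would select, you can test the pattern locally before you save the capability. The sketch below assumes that the mask follows shell-style wildcard rules, which this page does not confirm. The function name `select_files` and the sample file names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_files(file_names, mask="*"):
    """Return the file names that match the mask (case-sensitive, shell-style wildcards)."""
    return [name for name in file_names if fnmatchcase(name, mask)]


# Hypothetical directory listing, for illustration only.
files = ["orders.sql", "customers.sql", "README.txt"]

print(select_files(files))           # the default mask * keeps every file
print(select_files(files, "*.sql"))  # keeps only the files that end in .sql
```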

    Field | Description | Required?

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Technical Lineage for SqlDirectory

    Yes

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    Shared Storage Connection

    The Shared Storage connection that you created.

    No

    Mask

    The pattern of the file names in the directory. By default, the value is *.

    Yes

    Dialect

    The dialect of the database.

    Yes

    Collibra System Name

    The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.

    The value of this field must be the same as the full name of the System asset that you created when you registered the data source.

    Yes

    Database

    The name of your database, which is also the name of your Database asset in Data Catalog.

    Note The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the Technical Lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.

    Yes

    Schema

    The name of the default schema, if not specified in the data source itself. This corresponds to the name of your Schema asset.

    Note The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the Technical Lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.

    Yes

    Database-System mapping

    This optional field allows you to map databases to the systems they belong to, so that stitching can succeed. This resolves missing stitching, which occurs when Collibra Data Lineage associates multiple databases with the default system name that you provide in the Collibra System Name field.

    No

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

    No

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes
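The precedence rule in the notes for the Database and Schema fields above can be pictured as a simple name-resolution step. The following Python sketch is an illustration only, not how Collibra Data Lineage parses SQL. It assumes plain, unquoted, dot-separated table references, and it assumes that a two-part reference means schema.table. The function name `qualify` and the sample names are hypothetical.

```python
def qualify(table_ref, default_database, default_schema):
    """Return (database, schema, table) for a table reference.

    Names written in the SQL statement win; the capability's Database and
    Schema field values only fill in the parts that the statement omits.
    """
    parts = table_ref.split(".")
    if not all(parts) or len(parts) > 3:
        raise ValueError(f"unsupported table reference: {table_ref!r}")
    if len(parts) == 3:
        return (parts[0], parts[1], parts[2])
    if len(parts) == 2:
        # Assumption: a two-part reference is schema.table.
        return (default_database, parts[0], parts[1])
    return (default_database, default_schema, parts[0])


# Hypothetical field values: Database = sales_db, Schema = public.
print(qualify("orders", "sales_db", "public"))
print(qualify("crm.dbo.accounts", "sales_db", "public"))
```

In the first call the field values are used because the statement names only the table. In the second call the names from the statement take precedence and the field values are ignored.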

    Field | Description | Required?
    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Technical Lineage for Azure

    Yes

    Main Properties

    This section contains the information for creating a technical lineage.

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    JDBC Connection

    The JDBC connection that you created for Catalog JDBC ingestion.

    Yes

    Collibra System Name

    The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.

    The value of this field must be the same as the full name of the System asset that you created when you registered the data source.

    Yes

    Database Name

    The name of your database, which is also the name of your Database asset in Data Catalog.

    Yes

    Database Name Override

    We strongly recommend that you not edit the full name of your System, Database and Schema assets in Data Catalog. Doing so can lead to errors during the technical lineage creation process. If stitching is missing specifically because you edited the full name of your Database asset, you can use this field to specify the current name of your Database asset in Data Catalog.

    No

    Schema

    The name of the default schema, if not specified in the data source itself. This corresponds to the name of your Schema asset.

    Yes

    Queries

    The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.

    Example Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema');. This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. To exclude other schemas, add them to the list, for example: where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');.

    Note 
    • If you change queries, you can only use supported SQL syntax.
    • Collibra Support does not provide support for customized SQL files.
    Query | Description

    Columns

    This query retrieves the columns, tables, schemas, databases or projects fields in the form: database or project > schema > table > column.

    Synonyms

    This query retrieves the alternative names for the database objects.

    Views

    This query retrieves the view definitions.

    Other Queries

    This query retrieves other data that technical lineage needs, for example stored procedures, functions, and packages.

    Yes

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Advanced Properties

    This section contains the advanced properties for creating a technical lineage.

     

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

    No

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes

    Logging

    This section contains general information about logging.

    Debug

    An option to enable logging of a JDBC job. If you enable logging, you can download the output file of the JDBC job in the Edge Jobs dashboard (beta). The output file contains the logs of the JDBC driver. For more information about downloading the output file, go to Download job output files.

    Select one of the following values:

    True
    Enables logging of the JDBC job.
    False
    Disables logging of the JDBC job. This is the default value.

    No
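The Active, Analyze Only, and Delete Raw Metadata After Processing options in the table above are independent checkboxes, and it can help to write down what a given combination means before you save the capability. The Python sketch below only restates the descriptions on this page as text. The class `AdvancedOptions` and the function `summarize` are hypothetical names, and the sketch does not model how Collibra evaluates the options internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdvancedOptions:
    active: bool
    delete_raw_metadata: bool
    analyze_only: bool = False  # this page states the option is not enabled by default


def summarize(options):
    """Return one plain-language statement per option, based on this page's descriptions."""
    effects = []
    if options.active:
        effects.append("technical lineage of this data source is included")
    else:
        effects.append("technical lineage of this data source is excluded")
    if options.analyze_only:
        effects.append("source is only loaded and analyzed; lineage is not created during synchronization")
    if options.delete_raw_metadata:
        effects.append("raw metadata is deleted after processing")
    else:
        effects.append("raw metadata is kept in the Collibra infrastructure")
    return effects


for line in summarize(AdvancedOptions(active=True, delete_raw_metadata=True)):
    print(line)
```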

    Field | Description | Required?

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    JDBC Connection

    The JDBC connection that you created for Catalog JDBC ingestion.

    Yes

    Collibra System Name

    The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.

    The value of this field must be the same as the full name of the System asset that you created when you registered the data source.

    Yes

    Database Name

    The name of your database, which is also the name of your Database asset in Data Catalog.

    Yes

    Database Name Override

    We strongly recommend that you not edit the full name of your System, Database and Schema assets in Data Catalog. Doing so can lead to errors during the technical lineage creation process. If stitching is missing specifically because you edited the full name of your Database asset, you can use this field to specify the current name of your Database asset in Data Catalog.

    No

    Queries

    The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.

    Example Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema');. This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. To exclude other schemas, add them to the list, for example: where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');.

    Note 
    • If you change queries, you can only use supported SQL syntax.
    • Collibra Support does not provide support for customized SQL files.
    Query | Description

    Columns

    This query retrieves the columns, tables, schemas, databases or projects fields in the form: database or project > schema > table > column.

    Synonyms

    This query retrieves the alternative names for the database objects.

    Views

    This query retrieves the view definitions.

    Other Queries

    This query retrieves other data that technical lineage needs, for example stored procedures, functions, and packages.

    Yes

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

    No

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes

    Debug

    An option to enable logging of a JDBC job. If you enable logging, you can download the output file of the JDBC job in the Edge Jobs dashboard (beta). The output file contains the logs of the JDBC driver. For more information about downloading the output file, go to Download job output files.

    Select one of the following values:

    True
    Enables logging of the JDBC job.
    False
    Disables logging of the JDBC job. This is the default value.

    No

    Log level

    An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.

    No

    Field | Description | Required?
    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Technical Lineage for SqlDirectory

    Yes

    Main Properties

    This section contains the information for creating a technical lineage.

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    Shared Storage Connection

    The Shared Storage connection that you created.

    No

    Mask

    The pattern of the file names in the directory. By default, the value is *.

    Yes

    Dialect

    The dialect of the database.

    Yes

    Collibra System Name

    The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.

    The value of this field must be the same as the full name of the System asset that you created when you registered the data source.

    Yes

    Database

    The name of your database, which is also the name of your Database asset in Data Catalog.

    Note The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the Technical Lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.

    Yes

    Schema

    The name of the default schema, if not specified in the data source itself. This corresponds to the name of your Schema asset.

    Note The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the Technical Lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.

    Yes

    Database-System mapping

    This optional field lets you map each database to the system it belongs to, so that stitching can succeed. This resolves missing stitching, which occurs when Collibra Data Lineage associates multiple databases with the default system name that you provide in the Collibra System Name field.

    No

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Advanced Properties

    This section contains the advanced properties for creating a technical lineage.

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

     

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes
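
The notes on the Database and Schema fields above describe a precedence rule: names written in the SQL statements win, and the field values are used only for the parts that are missing. The following Python sketch illustrates that rule on a simple dotted table name. It is an illustration only, not how Collibra Data Lineage is implemented: the `TableRef` type and the `resolve_table` function are hypothetical, and the naive split on `.` does not handle quoted identifiers or dialect-specific naming.

```python
from typing import NamedTuple


class TableRef(NamedTuple):
    """A fully qualified table reference (hypothetical helper type)."""
    database: str
    schema: str
    table: str


def resolve_table(name: str, default_database: str, default_schema: str) -> TableRef:
    """Qualify a dotted table name, using the defaults only for missing parts.

    default_database and default_schema stand in for the values of the
    Database and Schema fields of the capability.
    """
    parts = name.split(".")
    if not all(parts) or len(parts) > 3:
        raise ValueError(f"Unsupported table name: {name!r}")
    if len(parts) == 3:
        # Database and schema are both in the SQL: the defaults are ignored.
        return TableRef(parts[0], parts[1], parts[2])
    if len(parts) == 2:
        # Only the schema is in the SQL: the default database fills the gap.
        return TableRef(default_database, parts[0], parts[1])
    # Bare table name: both defaults are used.
    return TableRef(default_database, default_schema, parts[0])
```

For example, with `dwh` and `dbo` as the field values, `sales.public.orders` keeps its own database and schema, while a bare `orders` is resolved to `dwh`, `dbo`, `orders`.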

    Field | Description | Required?

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Technical Lineage for SqlDirectory

    Yes

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    Shared Storage Connection

    The Shared Storage connection that you created.

    No

    Mask

    The pattern of the file names in the directory. By default, the value is *.

    Yes

    Dialect

    The dialect of the database.

    Yes

    Collibra System Name

    The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.

    The value of this field must be the same as the full name of the System asset that you created when you registered the data source.

    Yes

    Database

    The name of your database, which is also the name of your Database asset in Data Catalog.

    Note The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the Technical Lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.

    Yes

    Schema

    The name of the default schema, if not specified in the data source itself. This corresponds to the name of your Schema asset.

    Note The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the Technical Lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.

    Yes

    Database-System mapping

    This optional field lets you map each database to the system it belongs to, so that stitching can succeed. This resolves missing stitching, which occurs when Collibra Data Lineage associates multiple databases with the default system name that you provide in the Collibra System Name field.

    No

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

     

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes
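
The Mask field above selects files in the SQL directory by file-name pattern, with `*` as the default. The source does not spell out the pattern syntax, so the following Python sketch assumes glob-style wildcards. It only helps you preview which file names a given mask would select under that assumption. The `files_matching_mask` function and the file names are hypothetical.

```python
from fnmatch import fnmatchcase


def files_matching_mask(file_names, mask="*"):
    """Return the file names that match a glob-style mask.

    Assumption: the mask behaves like a shell glob, where * matches any
    run of characters. The default "*" therefore selects every file.
    """
    return [name for name in file_names if fnmatchcase(name, mask)]


# Hypothetical directory listing.
listing = ["orders.sql", "customers.sql", "readme.txt"]
sql_only = files_matching_mask(listing, "*.sql")
everything = files_matching_mask(listing)
```

Under this assumption, a mask of `*.sql` would skip `readme.txt`, while the default mask would pick up all three files.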

    Field | Description | Required?
    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Technical Lineage for Custom Technical Lineage

    Yes

    Main Properties

    This section contains the information for creating a technical lineage.

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    Shared Storage Connection

    The Shared Storage connection that you created.

    Yes

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Advanced Properties

    This section contains the advanced properties for creating a technical lineage.

     

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

    No

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes

    Field | Description | Required?

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    Shared Storage Connection

    The Shared Storage connection that you created.

    Yes

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

    No

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes

    Debug

    This setting is not valid for this integration. It should be set to false.

    No

    Log level

    This setting is not valid for this integration. It should be set to No logging.

    No

    Field | Description | Required?

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    No

    Databricks Connection

    The Databricks connection that you created.

    Yes

    Compute Resource HTTP Path

    The HTTP path of the compute resource in Databricks Unity Catalog that Collibra Data Lineage collects and processes to create technical lineage.

    Yes

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    Time Frame

    The number of days of data to collect. The default value is 365, which means that Collibra Data Lineage collects the data of the past 365 days.

    No

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Save Input Metadata

    This option determines whether you want to save the input metadata that is extracted from the data source in a file. The file can be useful for troubleshooting. Select this option only on request of Collibra Support.

    No

    Filters

    Use this section to include or exclude databases and schemas to be ingested. Enter the filters in JSON format. If you used filters when you integrated Databricks Unity Catalog, you can paste the content of the Filters and Domain Mapping field from the Databricks Unity Catalog capability into this field. Note that Collibra Data Lineage ignores the UUIDs that are specified in the content.

    Text in JSON format to include or exclude databases and schemas, and to configure domain mappings.

    • The text must be in JSON format and can contain an include and an exclude block. You can use any JSON validator to verify the format. Collibra is not responsible for the privacy, confidentiality, or protection of the data you submit to such JSON validators, and has no liability for such use.
    • In the include block, you can specify the domain in which specific catalogs or schemas must be ingested. The format is: "Catalog/Database > schema": "domain ID". For example, "HR > address-schema": "30000000-0000-0000-0000-000000000000".
    • In the exclude block, you can specify the catalogs or schemas that you don't want to ingest. For example, "* > test".
    • The exclude block has priority over the include block.
    • If the include block is not present, we ingest all assets into the same domain as the System asset.
    • If there is no explicit domain mapping for a schema, we use the domain specified for the database.
    • You can use the keyword default as a domain ID. In that case, the catalog or schema will be ingested in the same domain as the System asset.
    • A match with a database has priority over a match with a schema.
    • The integration fails before the synchronization starts if one or more domain IDs specified in the include block don't exist.
    • The integration fails before the synchronization starts if a domain ID is left empty in the include block.
    • You can use the ? and * wildcards in the catalog and schema names. If a catalog or schema matches multiple lines, the most detailed match is taken into account.

    No

    Logging configuration, Memory (MiB), and JVM arguments

    These fields contain configuration options that can help when investigating issues with the capability.

    Important Only complete these fields on request of or together with Collibra Support.

    No

    Debug

    This setting is not valid for this integration. It should be set to false.

    No

    Log level

    This setting is not valid for this integration. It should be set to No logging.

    No
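
The Filters rules above describe a JSON text with an include block that maps "Catalog/Database > schema" patterns to domain IDs, and an exclude block that lists patterns to skip. The following Python sketch assembles such a text and checks the two conditions that the rules name as failures it can detect locally: an empty domain ID and a malformed one. It rests on assumptions that the source does not confirm: that the top-level keys are named `include` and `exclude`, that the exclude block is a JSON array, and that domain IDs are UUIDs. The `build_filters` function is hypothetical, and the sketch cannot check whether a domain actually exists in Collibra.

```python
import json
import uuid


def build_filters(include, exclude):
    """Return the filter text as a JSON string.

    include: dict of "Catalog/Database > schema" pattern -> domain ID,
             where the keyword "default" is also accepted as a domain ID.
    exclude: list of patterns to leave out.
    """
    for pattern, domain_id in include.items():
        if not domain_id:
            raise ValueError(f"Empty domain ID for {pattern!r}")
        if domain_id != "default":
            # Assumption: domain IDs are UUIDs. uuid.UUID raises
            # ValueError when the text is not a well-formed UUID.
            uuid.UUID(domain_id)
    return json.dumps({"include": include, "exclude": exclude}, indent=2)


filters_text = build_filters(
    include={
        "HR > address-schema": "30000000-0000-0000-0000-000000000000",
        "Sales > *": "default",
    },
    exclude=["* > test"],
)
```

Because the text is produced by `json.dumps`, it is valid JSON without the need to submit it to an external validator.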

    Field | Description | Required?
    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Technical Lineage for Databricks Unity Catalog

    Yes

    Main Properties

    This section contains the information for creating a technical lineage.

    Databricks Connection

    The Databricks connection that you created.

    Yes

    Compute Resource HTTP Path

    The HTTP path of the compute resource in Databricks Unity Catalog that Collibra Data Lineage collects and processes to create technical lineage.

    Yes

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    Time Frame

    The number of days of data to collect. The default value is 365, which means that Collibra Data Lineage collects the data of the past 365 days.

    No

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Advanced Properties

    This section contains the advanced properties for creating a technical lineage.

    Save Input Metadata

    This option determines whether you want to save the input metadata that is extracted from the data source in a file. The file can be useful for troubleshooting. Select this option only on request of Collibra Support.

    No

    Filters

    Use this section to include or exclude databases and schemas to be ingested. Enter the filters in JSON format. If you used filters when you integrated Databricks Unity Catalog, you can paste the content of the Filters and Domain Mapping field from the Databricks Unity Catalog capability into this field. Note that Collibra Data Lineage ignores the UUIDs that are specified in the content.

    Text in JSON format to include or exclude databases and schemas, and to configure domain mappings.

    • The text must be in JSON format and can contain an include and an exclude block. You can use any JSON validator to verify the format. Collibra is not responsible for the privacy, confidentiality, or protection of the data you submit to such JSON validators, and has no liability for such use.
    • In the include block, you can specify the domain in which specific catalogs or schemas must be ingested. The format is: "Catalog/Database > schema": "domain ID". For example, "HR > address-schema": "30000000-0000-0000-0000-000000000000".
    • In the exclude block, you can specify the catalogs or schemas that you don't want to ingest. For example, "* > test".
    • The exclude block has priority over the include block.
    • If the include block is not present, we ingest all assets into the same domain as the System asset.
    • If there is no explicit domain mapping for a schema, we use the domain specified for the database.
    • You can use the keyword default as a domain ID. In that case, the catalog or schema will be ingested in the same domain as the System asset.
    • A match with a database has priority over a match with a schema.
    • The integration fails before the synchronization starts if one or more domain IDs specified in the include block don't exist.
    • The integration fails before the synchronization starts if a domain ID is left empty in the include block.
    • You can use the ? and * wildcards in the catalog and schema names. If a catalog or schema matches multiple lines, the most detailed match is taken into account.

    No

    Logging configuration, Memory (MiB), and JVM arguments

    These fields contain configuration options that can help when investigating issues with the capability.

    Important Only complete these fields on request of or together with Collibra Support.

    No
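
Two of the Filters rules above interact: the exclude block has priority over the include block, and patterns can use the ? and * wildcards. The following Python sketch shows one way to reason about whether a given catalog and schema would pass a filter. It is a simplified model, not Collibra's matching logic: the `is_ingested` function is hypothetical, it handles only patterns written as "catalog > schema", it assumes that a non-empty include block limits ingestion to matching entries, and it does not model the "most detailed match" rule or domain resolution. Python's `fnmatchcase` also interprets `[...]` character sets, which the rules above do not mention.

```python
from fnmatch import fnmatchcase


def is_ingested(catalog, schema, include, exclude):
    """Return True if "catalog > schema" passes the filter.

    include: dict of pattern -> domain ID (may be empty).
    exclude: list of patterns (may be empty).
    """
    target = f"{catalog} > {schema}"
    # Exclude wins: any matching exclude pattern rejects the schema,
    # even if an include pattern also matches.
    if any(fnmatchcase(target, pattern) for pattern in exclude):
        return False
    # No include block: everything that is not excluded is ingested.
    if not include:
        return True
    # Assumption: with an include block, only matching entries pass.
    return any(fnmatchcase(target, pattern) for pattern in include)


include = {"HR > *": "default"}
exclude = ["* > test"]
```

With these hypothetical filters, `HR > address-schema` passes, while `HR > test` is rejected because the exclude pattern matches it.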

    Field | Description | Required?

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    No

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    Shared Storage Connection

    The Shared Storage connection that you created.

    Yes

    Mask

    The pattern of the file names in the directory. By default, the value is *.

    No

    Source Configuration

    The connection definitions, where you specify relevant translations for each data source. Specify the following properties in JSON format and enter the content in this field.

    If you previously created a technical lineage for this data source with connection definitions by using the lineage harvester, you can enter the content from the <sourceId>.conf file in this field.

    No

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

    No

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes

    Debug

    This setting is not valid for this integration. It should be set to false.

    No

    Log level

    This setting is not valid for this integration. It should be set to No logging.

    No

    Field | Description | Required?
    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Technical Lineage for DataStage

    Yes

    Main Properties

    This section contains the information for creating a technical lineage.

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    Shared Storage Connection

    The Shared Storage connection that you created.

    Yes

    Mask

    The pattern of the file names in the directory. By default, the value is *.

    No

    Source Configuration

    The connection definitions, where you specify relevant translations for each data source. Specify the following properties in JSON format and enter the content in this field.

    If you previously created a technical lineage for this data source with connection definitions by using the lineage harvester, you can enter the content from the connection_definitions.conf file in this field.

    No

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Advanced Properties

    This section contains the advanced properties for creating a technical lineage.

     
    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

    No

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes
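    The Mask field in the table above selects files by name pattern. The following minimal sketch illustrates how such a pattern could narrow a set of file names. It assumes that the mask uses glob-style wildcards, which this page does not state explicitly; the select_files helper and the file names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_files(file_names, mask="*"):
    """Return the file names that match a glob-style mask.

    Assumption: the Mask field behaves like a glob pattern.
    The default "*" mirrors the default value described for the field.
    """
    return [name for name in file_names if fnmatchcase(name, mask)]


# Hypothetical directory content, for illustration only.
files = ["orders.sql", "customers.sql", "readme.txt"]

all_files = select_files(files)            # default mask keeps every file
sql_files = select_files(files, "*.sql")   # keeps only names ending in .sql

print(all_files)
print(sql_files)
```

    With the default mask, every file in the directory is selected. A narrower mask such as *.sql limits the selection to matching file names.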

    Field | Description | Required?
    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Technical Lineage for Db2

    Yes

    Main Properties

    This section contains the information for creating a technical lineage.

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    JDBC Connection

    The JDBC connection that you created for Catalog JDBC ingestion.

    Yes

    Collibra System Name

    The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.

    The value of this field must be the same as the full name of the System asset that you created when you registered the data source.

    Yes

    Database Name

    The name of your database, which is also the name of your Database asset in Data Catalog.

    Important This field is mandatory, but the value you specify is not taken into consideration. We will remove this field in a future Collibra version.

    Yes

    Database Name Override

    We strongly recommend that you not edit the full name of your System, Database and Schema assets in Data Catalog. Doing so can lead to errors during the technical lineage creation process. If stitching is missing specifically because you edited the full name of your Database asset, you can use this field to specify the current name of your Database asset in Data Catalog.

    No

    Queries

    The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.

    Example Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema'); This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. To exclude other schemas, extend the list, for example: where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');

    Note 
    • If you change queries, you can only use supported SQL syntax.
    • Collibra Support does not provide support for customized SQL files.
    Query | Description
    Columns | This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
    Views | This query retrieves the view definitions.

    Yes

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Advanced Properties

    This section contains the advanced properties for creating a technical lineage.

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

    No

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes

    Logging

    This section contains general information about logging.

    Debug

    An option to enable logging of a JDBC job. If you enable logging, you can download the output file of the JDBC job in the Edge Jobs dashboard (beta). The output file contains the logs of the JDBC driver. For more information about downloading the output file, go to Download job output files.

    Select one of the following values:

    True
    Enables logging of the JDBC job.
    False
    Disables logging of the JDBC job. This is the default value.

    No
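    The schema filter shown in the Queries example above can also be generated instead of typed by hand. The following minimal sketch builds the same where clause from a list of schema names. The schema_exclusion_filter helper is hypothetical and is not part of any Collibra tooling; the v.table_schema column name is taken from the example on this page and may differ for your data source.

```python
def schema_exclusion_filter(excluded_schemas, column="v.table_schema"):
    """Build a `where <column> not in (...)` filter for a Views query.

    Single quotes inside a schema name are doubled, as standard SQL
    string literals require.
    """
    if not excluded_schemas:
        raise ValueError("at least one schema name is required")
    quoted = ", ".join(
        "'" + name.replace("'", "''") + "'" for name in excluded_schemas
    )
    return f"where {column} not in ({quoted});"


default_filter = schema_exclusion_filter(["pg_catalog", "information_schema"])
extended_filter = schema_exclusion_filter(
    ["pg_catalog", "information_schema", "another_schema"]
)

print(default_filter)
print(extended_filter)
```

    Review the generated clause before you paste it into the query, because only supported SQL syntax can be used in modified queries.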

    Field | Description | Required?

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Source ID

    The name of the data source. Specify a name that is unique.

    Yes

    JDBC Connection

    The JDBC connection that you created for Catalog JDBC ingestion.

    Yes

    Collibra System Name

    The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.

    The value of this field must be the same as the full name of the System asset that you created when you registered the data source.

    Yes

    Database Name

    The name of your database, which is also the name of your Database asset in Data Catalog.

    Important This field is mandatory, but the value you specify is not taken into consideration. We will remove this field in a future Collibra version.

    Yes

    Database Name Override

    We strongly recommend that you not edit the full name of your System, Database and Schema assets in Data Catalog. Doing so can lead to errors during the technical lineage creation process. If stitching is missing specifically because you edited the full name of your Database asset, you can use this field to specify the current name of your Database asset in Data Catalog.

    No

    Queries

    The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.

    Example Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema'); This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. To exclude other schemas, extend the list, for example: where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');

    Note 
    • If you change queries, you can only use supported SQL syntax.
    • Collibra Support does not provide support for customized SQL files.
    Query | Description
    Columns | This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
    Views | This query retrieves the view definitions.

    Yes

    Property

    This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

    You can use this field to set the HTTP timeout duration by adding the httpTimeout property: 

    Warning If you are a Collibra Cloud for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

    Yes for US government customers.

    Dependent On Sources

    This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

    To use this option, enter the source ID of the independent source.

    For complete information, go to Sharing database models across data sources.

    No

    Delete Raw Metadata After Processing

    Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

    Select this option to indicate that the raw source metadata is deleted after processing.

    Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

    No

    Analyze Only

    This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.

    When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.

    This option is not enabled by default.

    No

    Active

    This option determines whether to include or exclude the technical lineage of the data source.

    Select this option to include the technical lineage of this data source.

    Clear the checkbox to exclude the technical lineage of this data source.

    Yes

    Debug

    An option to enable logging of a JDBC job. If you enable logging, you can download the output file of the JDBC job in the Edge Jobs dashboard (beta). The output file contains the logs of the JDBC driver. For more information about downloading the output file, go to Download job output files.

    Select one of the following values:

    True
    Enables logging of the JDBC job.
    False
    Disables logging of the JDBC job. This is the default value.

    No

    Log level

    An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.

    No
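    The Required? column of the table above can be turned into a simple pre-flight check before you fill in the page. The following minimal sketch lists the fields that are marked Yes in that table and reports the ones that are still empty. The missing_required_fields helper and the sample values are hypothetical. The Property field is left out because it is required only for Collibra Cloud for Government customers.

```python
# Fields marked "Yes" in the Required? column of the table above.
REQUIRED_FIELDS = [
    "Name",
    "Description",
    "Source ID",
    "JDBC Connection",
    "Collibra System Name",
    "Database Name",
    "Queries",
    "Active",
]


def missing_required_fields(values):
    """Return the required field names that are absent or empty in `values`."""
    return [
        field for field in REQUIRED_FIELDS
        if values.get(field) is None or values.get(field) == ""
    ]


# Hypothetical draft of the capability settings, for illustration only.
draft = {
    "Name": "db-lineage",
    "Description": "Technical lineage for the warehouse",
    "Source ID": "warehouse-source",
    "Active": True,
}

missing = missing_required_fields(draft)
print(missing)
```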

    Field | Description | Required?
    Capability

    This section contains general information about the capability.

    Name

    The name of the Edge capability.

    Yes

    Description

    The description of the Edge capability.

    Yes

    Capability template

    The capability template. The value that you select in this field determines which sections appear on the page.

    Select the following Edge capability:

    Technical Lineage for SqlDirector