Add an Edge capability to an Edge site
After you have created and installed an Edge site, you can add an Edge capability to perform specific tasks on a data source. For example, you can register a data source by using a JDBC connection that belongs to an Edge capability.
Prerequisites
- You have a global role that has the System administration global permission.
- You have a global role with the Manage connections and capabilities global permission, for example, Edge integration engineer.
- Optional: You have a global role with the Register profiling information global permission.
- You have created and installed an Edge site.
- You have created a JDBC connection.
Steps
The information in this section varies depending on the capability template that you select.
Capability template
Select a data source and, if needed, the connection type to see the related information. The information is available for the following data sources:
Amazon Redshift
Azure SQL Data Warehouse
Azure SQL Server
Azure Synapse Analytics
DB2
Google BigQuery
Greenplum
HiveQL
IBM InfoSphere DataStage
Informatica Intelligent Cloud Services
Informatica PowerCenter
Matillion
Oracle
PostgreSQL
MySQL
Netezza
SAP Hana
Snowflake
Spark SQL
SQL Server
SQL Server Integration Services
Sybase
Teradata
Custom technical lineage
Which connection type do you use?
For best technical lineage results, use the JDBC connection to ingest JDBC sources when possible, rather than the Shared Storage connection with SQL files.
- Open an Edge site.
  - On the main menu, click the icon, and then click Settings.
    The Settings page opens.
  - In the tab pane, click Edge.
    The Edge sites overview appears.
  - In the Edge sites overview, click the name of an Edge site with the status Healthy.
    The Edge site page appears.
- In the Capabilities section, click Add capability.
  The Add capability page appears.
- Enter the required information.
| Field | Description | Required |
| --- | --- | --- |
| Capability | This section contains the general information about the capability. | |
| Name | The name of the Edge capability. | Yes |
| Description | The description of the Edge capability. | No |
| Capability template | The capability template, which determines the next available sections. Select the following Edge capability: S3 synchronization. | Yes |
| S3 service account | This section contains the information on how to connect to Amazon S3. | |
| AWS Connection | The AWS connection to be used. | Yes |
| IAM role | The IAM role used by the AWS Glue crawlers. | Yes |
| Encryption options | Select the type of encryption used to store the IAM role. Default: To be encrypted by Edge management server. | Yes |
| Delete Glue database left after previous synchronization of the file system | Select the checkbox if you want the capability to delete the Glue database created by previous runs of the capability before it starts the synchronization. If you clear this checkbox, the Glue database created by previous runs is not removed. This can be useful for troubleshooting. By default, this checkbox is selected. | No |
| Save input metadata | Select the checkbox if you want to save the input metadata extracted from the data source in ZIP files. The files can be useful for troubleshooting. Select this option only at the request of Collibra Support. The Collibra Support team can provide the location of the saved ZIP files after the S3 synchronization. By default, this checkbox is not selected. | No |
| Field | Description | Required |
| --- | --- | --- |
| Capability | This section contains the general information about the capability. | |
| Name | The name of the Edge capability. | Yes |
| Description | The description of the Edge capability. | No |
| Capability template | The capability template, which determines the next available sections. Select the following capability template to ingest Collibra Data Quality & Observability user-defined rules, metrics, and dimensions into Collibra Data Catalog: DQ Connector. | Yes |
| DQ | This section contains information about the Collibra Data Quality & Observability connection. | |
| Base URL | Your Collibra Data Quality & Observability URL. | Yes |
| Username | The Collibra Data Quality & Observability username for this connection. | Yes |
| Password | The Collibra Data Quality & Observability password for this connection. | Yes |
| Encryption options | Select the type of encryption to use. Default: To be encrypted by Edge management server. | |
| Issuer of the JWT | If you have selected Encrypted with public key, enter your JWT issuer. | No |
| Collibra metadata model | This section contains information about where to ingest Collibra Data Quality & Observability assets. | |
| DQ Rules domain id | The UUID of the Rulebook Domain for the ingested Collibra Data Quality & Observability rules. | Yes |
| DQ Metrics domain id | The UUID of the Rulebook Domain for the ingested Collibra Data Quality & Observability metrics. | Yes |
| DQ Dimensions domain id | The UUID of the Governance Asset Domain for the ingested Collibra Data Quality & Observability dimensions. | Yes |
| Default DQ Dimension name | The default Data Quality Dimension, for example, Accuracy, Completeness, or Consistency. Default: Completeness. | Yes |
| DQ Metric classified by DQ Dimension relation type id | The UUID of the Data Quality Metric classified by / classifies Data Quality Dimension relation. If left unspecified, this relation is not added. | No |
| Assets are imported in batches of this size | The batch size of the ingestion. Default: 5000. | Yes |
| General | This section contains general information about logging. | |
| Debug | An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false. Note: We strongly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours. | No |
| Log level | An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging. | No |
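The domain id fields above expect UUIDs, and assets are imported in batches (5000 by default). As a rough illustration only, the following Python sketch shows how such values could be sanity-checked before you enter them, and what batching by a fixed size looks like. The function names are hypothetical and are not part of any Collibra API; how the DQ Connector batches assets internally is not described here.

```python
import uuid
from typing import Iterator, List, Sequence

# Default batch size as stated in the table above.
DEFAULT_BATCH_SIZE = 5000


def is_valid_uuid(value: str) -> bool:
    """Return True if `value` parses as a UUID (hypothetical pre-check)."""
    try:
        uuid.UUID(value)
        return True
    except ValueError:
        return False


def batches(assets: Sequence[str], size: int = DEFAULT_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` assets."""
    if size <= 0:
        raise ValueError("batch size must be positive")
    for start in range(0, len(assets), size):
        yield list(assets[start:start + size])
```

With the default size, 12000 assets would be split into batches of 5000, 5000, and 2000.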
| Field | Description | Required |
| --- | --- | --- |
| Capability | This section contains the general information about the capability. | |
| Name | The name of the Edge capability. | Yes |
| Description | The description of the Edge capability. | No |
| Capability template | The capability template, which determines the next available sections. Select your Edge capability template. Note: When you select a capability template, you may need to add required custom properties. For example, if you select the S3 synchronization capability template, you have to add credentials to configure the S3 connection. | Yes |
| General | This section contains general information about logging. | |
| Debug | An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false. Note: We strongly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours. | No |
| Log level | An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging. | No |
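The Required column above can be read as a simple checklist. The following Python sketch is only an illustration of that idea: it encodes the Required flags from the generic table in a dictionary and reports which required fields are still empty. The dictionary and the function are hypothetical helpers, not part of the Edge product.

```python
from typing import Dict, List

# Required flags copied from the table above; the structure itself is illustrative.
REQUIRED_FIELDS: Dict[str, bool] = {
    "Name": True,
    "Description": False,
    "Capability template": True,
    "Debug": False,
    "Log level": False,
}


def missing_required(values: Dict[str, str]) -> List[str]:
    """Return the required fields that are absent or blank in `values`."""
    return [
        field
        for field, required in REQUIRED_FIELDS.items()
        if required and not values.get(field, "").strip()
    ]
```

For example, a form with only Name filled in would still be missing Capability template.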
| Field | Description | Required |
| --- | --- | --- |
| Capability | This section contains the general information about the capability. | |
| Name | The name of the Edge capability. | Yes |
| Description | The description of the Edge capability. | No |
| Capability template | The capability template, which determines the next available sections. Select the following Edge capability: GCS synchronization. | Yes |
| GCP service account | This section contains information on how to connect to Google Cloud Storage. | |
| GCP Connection | The GCP connection to be used. | Yes |
| Configuration | This section contains information on the configuration of the crawlers. | |
| Maximum number of files per crawler | The maximum number of files that can be registered per crawler. The default value is 100. | Yes |
| Field | Description | Required |
| --- | --- | --- |
| Capability | This section contains the general information about the capability. | |
| Name | The name of the Edge capability. | Yes |
| Description | The description of the Edge capability. | No |
| Capability template | The capability template, which determines the next available sections. Select the following Edge capability: Catalog JDBC ingestion. | Yes |
| Connection | This section contains information to connect to the data source. | |
| JDBC connection | | Yes |
| JDBC data source type (Deprecated) | Deprecated field. The field was used to indicate the type of the data source. You no longer need to change this field. The required value is automatically identified. Note: The automatically identified value is not shown on this page. | Yes |
| Supports schemas | A text field in which you have to enter True to enable database registration of data sources that have no schema. If the data source has schemas, you can ignore this field. Tip: If the data source does not have a schema, Data Catalog creates a Schema asset with the same name as the full name of the database. | No |
| Others | This section can contain additional capability properties. Warning: Adding additional properties can have a significant impact on your Edge site. Only add or update them together with Collibra Support. Note: No validation is performed on the values you add. | No |
| General | This section contains general information about logging. | |
| Debug | An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false. Note: We strongly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours. | No |
| Log level | An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging. | No |
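The Debug option reverts to false 24 hours after you set it to true. The short Python sketch below only illustrates that arithmetic, so you can estimate when log forwarding stops. The function names are hypothetical, and the exact moment at which Edge performs the revert is an assumption.

```python
from datetime import datetime, timedelta, timezone

# The 24-hour window stated in the Debug note above.
DEBUG_WINDOW = timedelta(hours=24)


def debug_reverts_at(enabled_at: datetime) -> datetime:
    """Return the time at which Debug is expected to revert to false."""
    return enabled_at + DEBUG_WINDOW


def is_debug_active(enabled_at: datetime, now: datetime) -> bool:
    """Return True while `now` falls inside the 24-hour Debug window."""
    return enabled_at <= now < debug_reverts_at(enabled_at)


if __name__ == "__main__":
    enabled = datetime(2024, 1, 3, 9, 0, tzinfo=timezone.utc)
    print(debug_reverts_at(enabled).isoformat())
```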
| Field | Description | Required |
| --- | --- | --- |
| Capability | This section contains the general information about the capability. | |
| Name | The name of the Edge capability. | Yes |
| Description | The description of the Edge capability. | No |
| Capability template | The capability template, which determines the next available sections. Select the following Edge capability: JDBC Profiling. | Yes |
| Connection | This section contains information to connect to the data source. | |
| JDBC connection | | Yes |
| Others | This section can contain additional capability properties. Warning: Adding additional properties can have a significant impact on your Edge site. Only add or update them together with Collibra Support. Note: No validation is performed on the values you add. | No |
| General | This section contains general information about logging. | |
| Debug | An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false. Note: We strongly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours. | No |
| Log level | An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging. | No |
| Field | Description | Required |
| --- | --- | --- |
| Capability | This section contains the general information about the capability. | |
| Name | The name of the Edge capability. | Yes |
| Description | The description of the Edge capability. | No |
| Capability template | The capability template, which determines the next available sections. Select the following Edge capability: Catalog JDBC Sampling. | Yes |
| Connection | This section contains information to connect to the data source. | |
| JDBC connection | | Yes |
| General | This section contains general information about logging. | |
| Debug | An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false. Note: We strongly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours. | No |
| Log level | An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging. | No |
| Field | Description | Required? |
| --- | --- | --- |
| Capability | This section contains the general information about the capability. | |
| Name | The name of the Edge capability. | Yes |
| Description | The description of the Edge capability. | Yes |
| Capability template | The capability template, which determines the next available sections. Select the following Edge capability: Technical Lineage for Redshift. | Yes |
| Main Properties | This section contains the information for creating a technical lineage. | |
| Source ID | The name of the data source. Specify a name that is unique. | Yes |
| JDBC connection | The JDBC connection that you created for Catalog JDBC ingestion. | Yes |
| Collibra system name | The system or server name of the data source. This field is also the full name of your System asset in Data Catalog. The value of this field must be the same as the full name of the System asset that you created when you registered the data source. | Yes |
| Database name | The name of your database. Tip: You can add extra database names by clicking Add property. | Yes |
| Queries | The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed. Example: Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema'); This query excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the query to, for example, where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema'); The available queries are as follows. Columns: This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column. Views: This query retrieves the view definitions. | Yes |
| Advanced Properties | This section contains the advanced properties for creating a technical lineage. | |
| Delete raw metadata after processing | Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed. Select this option to indicate that the raw source metadata is deleted after processing. Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure. Note: Selecting this option can negatively impact performance. | No |
| Analyze only | This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances. When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default. Use case: You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize them as follows. During weekdays, synchronize data sources A and B with the Analyze only checkbox selected in the capabilities; Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage. On the weekend, synchronize data source C without selecting the Analyze only checkbox in the capability; Collibra Data Lineage synchronizes the technical lineage for all data sources, including A, B, and C. | No |
| Active | This option determines whether to include or remove the technical lineage of the data source. Select this option to include the technical lineage of this data source. Clear the checkbox to exclude the technical lineage of this data source. | Yes |
| General | This section contains general information about logging. | |
| Debug | An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false. Note: We strongly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours. | No |
| Log level | An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging. | No |
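The schema filter described for the Views query can be tried out on a small scale. The following Python sketch uses an in-memory SQLite table as a hypothetical stand-in for the view catalog of a data source; the table name, columns, and sample rows are assumptions made for illustration, and the actual query code that Edge provides may differ. It shows how a not in filter removes the system schemas and, optionally, another schema.

```python
import sqlite3
from typing import List, Sequence, Tuple

# Hypothetical stand-in for a data source's view catalog.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE views (table_schema TEXT, table_name TEXT, view_definition TEXT)"
)
conn.executemany(
    "INSERT INTO views VALUES (?, ?, ?)",
    [
        ("pg_catalog", "pg_tables", "SELECT 1"),
        ("information_schema", "columns", "SELECT 1"),
        ("sales", "v_orders", "SELECT * FROM orders"),
        ("another_schema", "v_tmp", "SELECT 1"),
    ],
)


def view_names(excluded: Sequence[str]) -> List[Tuple[str, str]]:
    """List (schema, view) pairs whose schema is not in `excluded`."""
    placeholders = ", ".join("?" for _ in excluded)
    sql = (
        "SELECT v.table_schema, v.table_name FROM views v "
        f"WHERE v.table_schema NOT IN ({placeholders}) "
        "ORDER BY v.table_schema, v.table_name"
    )
    return conn.execute(sql, list(excluded)).fetchall()
```

Calling view_names(["pg_catalog", "information_schema"]) keeps the views in sales and another_schema; adding "another_schema" to the list leaves only the sales view.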
| Field | Description | Required? |
| --- | --- | --- |
| Capability | This section contains the general information about the capability. | |
| Name | The name of the Edge capability. | Yes |
| Description | The description of the Edge capability. | Yes |
| Capability template | The capability template, which determines the next available sections. Select the following Edge capability: Technical Lineage for Spark. Select the Technical lineage capability template for your data source to create a technical lineage for the JDBC data source. Important: Technical lineage via Edge is only available in private beta. Create a support ticket to get access. | Yes |
| Main Properties | This section contains the information for creating a technical lineage. | |
| Source ID | The name of the data source. Specify a name that is unique. | Yes |
| JDBC connection | The JDBC connection that you created for Catalog JDBC ingestion. | Yes |
| Collibra system name | The system or server name of the data source. This field is also the full name of your System asset in Data Catalog. The value of this field must be the same as the full name of the System asset that you created when you registered the data source. | Yes |
| Database name | The name of your database. Tip: You can add extra database names by clicking Add property. | Yes |
| Queries | The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed. Example: Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema'); This query excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the query to, for example, where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema'); The available queries are as follows. Columns: This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column. Object names: This query retrieves a list of object names from which technical lineage can be created. The objects can include stored procedures, views, macros, and so on. | Yes |
| Advanced Properties | This section contains the advanced properties for creating a technical lineage. | |
| Delete raw metadata after processing | Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed. Select this option to indicate that the raw source metadata is deleted after processing. Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure. Note: Selecting this option can negatively impact performance. | No |
| Analyze only | This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances. When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default. Use case: You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize them as follows. During weekdays, synchronize data sources A and B with the Analyze only checkbox selected in the capabilities; Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage. On the weekend, synchronize data source C without selecting the Analyze only checkbox in the capability; Collibra Data Lineage synchronizes the technical lineage for all data sources, including A, B, and C. | No |
| Active | This option determines whether to include or remove the technical lineage of the data source. Select this option to include the technical lineage of this data source. Clear the checkbox to exclude the technical lineage of this data source. | Yes |
| General | This section contains general information about logging. | |
| Debug | An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false. Note: We strongly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours. | No |
| Log level | An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging. | No |
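The Analyze only use case with data sources A, B, and C amounts to a small scheduling rule. The Python sketch below restates that rule as code, purely as an illustration. The SyncPlan type and the plan_for function are hypothetical names; Edge itself does not expose such an interface here, and synchronization is started from the capability, not from this script.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class SyncPlan:
    """One capability synchronization and its Analyze only setting."""
    source: str
    analyze_only: bool


def plan_for(day: date) -> List[SyncPlan]:
    """Return the synchronizations to run on `day`, following the use case.

    Weekdays: A and B with Analyze only selected (load and analyze only).
    Weekend: C without Analyze only, which synchronizes the technical
    lineage for all data sources, including A and B.
    """
    if day.weekday() < 5:  # Monday is 0, Friday is 4
        return [SyncPlan("A", True), SyncPlan("B", True)]
    return [SyncPlan("C", False)]
```

For instance, plan_for(date(2024, 1, 3)), a Wednesday, returns the two Analyze only runs for A and B, whereas a Saturday returns the single full run for C.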
| Field | Description | Required? |
| --- | --- | --- |
| Capability | This section contains the general information about the capability. | |
| Name | The name of the Edge capability. | Yes |
| Description | The description of the Edge capability. | Yes |
| Capability template | The capability template, which determines the next available sections. Select the following Edge capability: Technical Lineage for Azure. | Yes |
| Main Properties | This section contains the information for creating a technical lineage. | |
| Source ID | The name of the data source. Specify a name that is unique. | Yes |
| JDBC connection | The JDBC connection that you created for Catalog JDBC ingestion. | Yes |
| Collibra system name | The system or server name of the data source. This field is also the full name of your System asset in Data Catalog. The value of this field must be the same as the full name of the System asset that you created when you registered the data source. | Yes |
| Database name | The name of your database. Tip: You can add extra database names by clicking Add property. | Yes |
| Queries | The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed. Example: Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema'); This query excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the query to, for example, where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema'); The available queries are as follows. Columns: This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column. Synonyms: This query retrieves the alternative names for the database objects. Views: This query retrieves the view definitions. Other queries: This query retrieves other data that technical lineage needs, for example, stored procedures, functions, and packages. | Yes |
| Advanced Properties | This section contains the advanced properties for creating a technical lineage. | |
| Delete raw metadata after processing | Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed. Select this option to indicate that the raw source metadata is deleted after processing. Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure. Note: Selecting this option can negatively impact performance. | No |
| Analyze only | This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances. When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default. Use case: You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize them as follows. During weekdays, synchronize data sources A and B with the Analyze only checkbox selected in the capabilities; Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage. On the weekend, synchronize data source C without selecting the Analyze only checkbox in the capability; Collibra Data Lineage synchronizes the technical lineage for all data sources, including A, B, and C. | No |
| Active | This option determines whether to include or remove the technical lineage of the data source. Select this option to include the technical lineage of this data source. Clear the checkbox to exclude the technical lineage of this data source. | Yes |
| General | This section contains general information about logging. | |
| Debug | An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false. Note: We strongly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours. | No |
| Log level | An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging. | No |
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Azure
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
JDBC connectionThe JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system nameThe system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database nameThe name of your database.
Tip You can add extra database names by clicking Add property.
Yes
SchemaThe name of the default schema, if not specified in the data source itself. This corresponds to the name of your Schema asset.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');. This query excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the query to, for examplewhere v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');.Query Description Columns This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column. Synonyms This query retrieves the alternative names for the database objects. Views This query retrieves the view definitions. Other queries This query retrieves other data that technical lineage needs, for example stored procedures, functions, and packages.
Yes
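The schema filter in the Queries example can also be generated rather than typed by hand when the list of excluded schemas is long. The following is a minimal sketch, not part of the product: the function name build_schema_filter and its parameters are hypothetical, and the column name v.table_schema is taken from the example above. Paste the resulting clause into the Views query yourself.

```python
def build_schema_filter(excluded_schemas, column="v.table_schema"):
    """Return a WHERE clause that excludes the given schemas (hypothetical helper)."""
    if not excluded_schemas:
        raise ValueError("at least one schema name is required")
    quoted = []
    for name in excluded_schemas:
        # Double any single quote so the name stays a valid SQL string literal.
        quoted.append("'" + name.replace("'", "''") + "'")
    return f"where {column} not in ({', '.join(quoted)})"


clause = build_schema_filter(["pg_catalog", "information_schema", "another_schema"])
print(clause)
```

The generated clause has the same shape as the filter shown in the example, so it can replace that filter directly.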
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to indicate that the raw source metadata is deleted after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze onlyThis option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case
You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
DebugAn option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for DataStage
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
Shared Storage connectionThe Shared Storage connection that you created.
Yes
Mask
The pattern of the file names in the directory. By default, the value is *.
No
Source ConfigurationThe connection definitions, where you specify relevant translations for each data source. Specify the following properties in JSON format and enter the content in this field.
Connection definition properties
Property Description
OdbcDataSources
Open Database Connectivity data sources in IBM InfoSphere DataStage for which you want to create a technical lineage.
<data-source-name>
The ODBC data source name that you use in your DataStage projects.
This section contains the properties to translate the database, schema and dialect.
dbname
The name of your database, to which the ODBC data source connection refers.
schema
The name of your schema, to which the ODBC data source connection refers.
dialect
The dialect of the referenced database.
You can enter one of the following values:
- azure, for an Azure SQL Server data source.
- bigquery, for a Google BigQuery data source.
- db2, for an IBM DB2 data source.
- hana, for a SAP HANA data source.
- hana-cviews, for a SAP HANA data source.
Important The hana-cviews dialect is supported for SAP HANA (on-premises). It is not supported for SAP HANA Cloud.
- hive, for a HiveQL data source.
- greenplum, for a Greenplum data source.
- mssql, for a Microsoft SQL Server data source.
- mysql, for a MySQL data source.
- netezza, for a Netezza data source.
- oracle, for an Oracle data source.
- postgres, for a PostgreSQL data source.
- redshift, for an Amazon Redshift data source.
- snowflake, for a Snowflake data source.
- spark, for a Spark SQL data source.
- sybase, for a Sybase data source.
- teradata, for a Teradata data source.
collibraSystemName
The name of the data source's system or server.
This property is only required when you set the value of the Collibra system name flag setting to True.
Note Specify this property with the same name as the full name of the System asset that you created when you registered the data source.
NonOdbcConnectors
Other data source connectors in IBM InfoSphere DataStage for which you want to create a technical lineage. For example, DB2, Oracle or Netezza.
Note This section is optional.
<data-source-connector-ID>
The data source username and database of the connector that you use in your DataStage projects, for example admin@database-name. The combination of the username and database name should be unique.
The following section contains the properties to translate the database, schema and dialect.
dbname
The name of your database, to which the data source connection refers.
schema
The name of your schema, to which the data source connection refers.
dialect
The dialect of the referenced database.
You can enter one of the following values:
- azure, for an Azure SQL Server data source.
- bigquery, for a Google BigQuery data source.
- db2, for an IBM DB2 data source.
- hana, for a SAP HANA data source.
- hana-cviews, for a SAP HANA data source.
Important The hana-cviews dialect is supported for SAP HANA (on-premises). It is not supported for SAP HANA Cloud.
- hive, for a HiveQL data source.
- greenplum, for a Greenplum data source.
- mssql, for a Microsoft SQL Server data source.
- mysql, for a MySQL data source.
- netezza, for a Netezza data source.
- oracle, for an Oracle data source.
- postgres, for a PostgreSQL data source.
- redshift, for an Amazon Redshift data source.
- snowflake, for a Snowflake data source.
- spark, for a Spark SQL data source.
- sybase, for a Sybase data source.
- teradata, for a Teradata data source.
collibraSystemName
The name of the data source's system or server.
This property is only required when you set the value of the Collibra system name flag setting to True. Specify this property with the same name as the full name of the System asset that you created when you registered the data source.
Example
{
  "OdbcDataSources": {
    "oracle-data-source": {
      "dbname": "my-oracle-database",
      "schema": "my-oracle-schema",
      "dialect": "oracle",
      "collibraSystemName": "my-system"
    },
    "mssql-data-source": {
      "dbname": "my-mssql-database",
      "schema": "my-mssql-schema",
      "dialect": "mssql",
      "collibraSystemName": "my-system"
    }
  },
  "NonOdbcConnectors": {
    "admin@database-name": {
      "dbname": "my-netezza-database",
      "schema": "my-netezza-schema",
      "dialect": "netezza",
      "collibraSystemName": "my-system"
    },
    "admin@second-database-name": {
      "dbname": "my-second-netezza-database",
      "schema": "my-second-netezza-schema",
      "dialect": "netezza",
      "collibraSystemName": "my-system"
    }
  }
}
Tip If you previously created a technical lineage for this data source with connection definitions by using the lineage harvester, you can enter the content from the connection_definitions.conf file in this field.
No
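Before you paste connection definitions into the Source Configuration field, it can help to check the JSON locally. The following is a minimal sketch under stated assumptions: the function check_source_configuration is hypothetical and not part of Edge, the set of dialects is copied from the list of allowed values in this section, and treating dbname, schema and dialect as mandatory for every entry is an assumption that the documentation does not state explicitly.

```python
import json

# Dialect values copied from the list of allowed values in this section.
ALLOWED_DIALECTS = {
    "azure", "bigquery", "db2", "hana", "hana-cviews", "hive", "greenplum",
    "mssql", "mysql", "netezza", "oracle", "postgres", "redshift",
    "snowflake", "spark", "sybase", "teradata",
}
# Assumption: every entry should carry these three translation properties.
EXPECTED_KEYS = ("dbname", "schema", "dialect")


def check_source_configuration(text):
    """Return a list of problems found in a Source Configuration JSON string."""
    config = json.loads(text)  # raises ValueError if the text is not valid JSON
    problems = []
    for section in ("OdbcDataSources", "NonOdbcConnectors"):
        for name, entry in config.get(section, {}).items():
            for key in EXPECTED_KEYS:
                if key not in entry:
                    problems.append(f"{section}.{name}: missing {key}")
            dialect = entry.get("dialect")
            if dialect is not None and dialect not in ALLOWED_DIALECTS:
                problems.append(f"{section}.{name}: unknown dialect {dialect!r}")
    return problems


example = """
{
  "OdbcDataSources": {
    "oracle-data-source": {
      "dbname": "my-oracle-database",
      "schema": "my-oracle-schema",
      "dialect": "oracle",
      "collibraSystemName": "my-system"
    }
  },
  "NonOdbcConnectors": {
    "admin@database-name": {
      "dbname": "my-netezza-database",
      "schema": "my-netezza-schema",
      "dialect": "netezza",
      "collibraSystemName": "my-system"
    }
  }
}
"""

print(check_source_configuration(example))
```

An empty list means that the sketch found nothing to report. It does not confirm that Edge accepts the configuration.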
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to indicate that the raw source metadata is deleted after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze onlyThis option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case
You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
DebugAn option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Db2
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
JDBC connectionThe JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system nameThe system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database nameThe name of your database.
Tip You can add extra database names by clicking Add property.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This query excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the query, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns
This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
Views
This query retrieves the view definitions.
Yes
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to indicate that the raw source metadata is deleted after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze onlyThis option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case
You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
DebugAn option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Bigquery
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
JDBC connectionThe JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system nameThe system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Project IDThe ID of the project.
Tip You can add extra project IDs by clicking Add property.
Yes
Region
The location of your BigQuery data. This is the region that you specified when you created the dataset.
No
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This query excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the query, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns
This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
Columns tail
This query retrieves all columns tails.
Views
This query retrieves the view definitions.
Dataset names
This query retrieves all logical units in the project.
Other queries
This query retrieves other data that technical lineage needs, for example stored procedures, functions, and packages.
Yes
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to indicate that the raw source metadata is deleted after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze onlyThis option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case
You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
DebugAn option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Greenplum
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
JDBC connectionThe JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system nameThe system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database nameThe name of your database.
Tip You can add extra database names by clicking Add property.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This query excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the query, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns
This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
Views
This query retrieves the view definitions.
Yes
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to indicate that the raw source metadata is deleted after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze onlyThis option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case
You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
DebugAn option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Hive
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
JDBC connectionThe JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system nameThe system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
External database nameThe database value to be used in the asset path (system -> database -> schema -> table).
No
Database nameThe name of your database.
Tip You can add extra database names by clicking Add property.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This query excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the query, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns
This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
Object names
This query retrieves a list of object names from which technical lineage can be created. The objects can include stored procedures, views, macros, and so on.
Yes
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to indicate that the raw source metadata is deleted after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze onlyThis option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case
You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
DebugAn option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical lineage for Informatica Intelligent Cloud Services (IICS)
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
IICS connectionThe Informatica Intelligent Cloud Services (IICS) connection that you created.
Note Collibra Data Intelligence Cloud 2023.03 or newer is required to use the Informatica Intelligent Cloud Services (IICS) connection.
No
Objects
The objects that you want to export. Each object requires a path and a type, for example:
"objects": [
  { "path": "Sales", "type": "Project" },
  { "path": "Finance/Task_Flows", "type": "Folder" },
  { "path": "Common/Task_Flows/tf_CalendarDimension", "type": "Taskflow" }
]
Tip For more information about the objects that you can export and the required information, go to the Informatica documentation.
No
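When the list of objects to export is long, you can generate the JSON for the Objects field instead of writing it by hand. The following is a minimal sketch, not part of Edge: the function make_objects is hypothetical, and the only rule it enforces is the one stated above, that each object requires a path and a type. The paths and types are the ones from the example. The sketch does not check whether a type is valid in Informatica.

```python
import json


def make_objects(entries):
    """Build the list of export objects from (path, type) pairs (hypothetical helper)."""
    objects = []
    for path, object_type in entries:
        # Each object requires both a path and a type.
        if not path or not object_type:
            raise ValueError(f"path and type are both required: {(path, object_type)!r}")
        objects.append({"path": path, "type": object_type})
    return objects


objects = make_objects([
    ("Sales", "Project"),
    ("Finance/Task_Flows", "Folder"),
    ("Common/Task_Flows/tf_CalendarDimension", "Taskflow"),
])

# Print the array; the example above shows it under the "objects" key.
print(json.dumps(objects, indent=2))
```

Compare the printed array with the example above before you enter it in the field.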
Parameter FilesThe Informatica Intelligent Cloud Services parameter files.
No
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to indicate that the raw source metadata is deleted after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze onlyThis option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze command with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case
You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
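The weekday and weekend pattern in the use case above can be sketched as a small planning helper. This is only an illustration: the function name plan_for, the source names, and the idea of deriving the checkbox value from the calendar date are assumptions made for this example and are not part of Edge. You still select or clear the Analyze only checkbox in each capability yourself.

```python
from datetime import date

# Hypothetical plan taken from the use case: sources A and B are synchronized
# on weekdays with Analyze only selected; source C is synchronized on the
# weekend with Analyze only cleared, which starts the full synchronization.
WEEKDAY_SOURCES = ("A", "B")
WEEKEND_SOURCES = ("C",)


def plan_for(day: date) -> dict:
    """Return {source: analyze_only} for the capabilities to synchronize on `day`."""
    is_weekend = day.weekday() >= 5  # Monday is 0, so 5 and 6 are Saturday and Sunday
    if is_weekend:
        return {source: False for source in WEEKEND_SOURCES}
    return {source: True for source in WEEKDAY_SOURCES}


if __name__ == "__main__":
    print(plan_for(date(2024, 1, 3)))  # a Wednesday
    print(plan_for(date(2024, 1, 6)))  # a Saturday
```

In this sketch, a value of True means that the capability only loads and analyzes its source, and False means that synchronizing the capability also creates the technical lineage.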
Active: This option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General: This section contains general information about logging.
Debug: An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level: An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required?
Capability: This section contains the general information about the capability.
Name: The name of the Edge capability.
Yes
Description: The description of the Edge capability.
Yes
Capability template: The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Informatica PowerCenter
Yes
Main Properties: This section contains the information for creating a technical lineage.
Source ID: The name of the data source. Specify a name that is unique.
Yes
Shared Storage connection: The Shared Storage connection that you created.
Yes
Mask: The pattern of the file names in the directory. By default, the value is *.
No
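To preview which files a mask would select, you can try the pattern locally before you enter it. The sketch below assumes that the mask behaves like a shell-style wildcard pattern; the source only states that the default value is *, so treat the matching rules here as an assumption. The function name preview_mask and the file names are hypothetical.

```python
from fnmatch import fnmatchcase


def preview_mask(mask: str, file_names: list) -> list:
    """Return the file names that a shell-style wildcard mask would match."""
    # fnmatchcase compares case-sensitively on every platform.
    return [name for name in file_names if fnmatchcase(name, mask)]


if __name__ == "__main__":
    files = ["wf_sales.xml", "wf_finance.xml", "readme.txt"]  # hypothetical names
    print(preview_mask("*", files))      # the default mask keeps every file
    print(preview_mask("*.xml", files))  # keeps only the .xml files
```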
Source Configuration: The connection definitions and system names. Specify the following properties in JSON format and enter the content in this field.
Connection definitions
Property Description
connectionDefinitions: This section contains the connection properties to a source in Informatica PowerCenter.
<connectionName>: The type of your source or target data source. This section contains the connection properties to a source or target in Informatica PowerCenter.
dbname: The name of your source or target database.
schema: The name of your source or target schema.
dialect: The dialect of the referenced database. You can enter one of the following values:
- azure, for an Azure SQL Server data source.
- bigquery, for a Google BigQuery data source.
- db2, for an IBM DB2 data source.
- hana, for a SAP HANA data source.
- hana-cviews, for a SAP HANA data source.
  Important The hana-cviews dialect is supported for SAP HANA (on-premises). It is not supported for SAP HANA Cloud.
- hive, for a HiveQL data source.
- greenplum, for a Greenplum data source.
- mssql, for a Microsoft SQL Server data source.
- mysql, for a MySQL data source.
- netezza, for a Netezza data source.
- oracle, for an Oracle data source.
- postgres, for a PostgreSQL data source.
- redshift, for an Amazon Redshift data source.
- snowflake, for a Snowflake data source.
- spark, for a Spark SQL data source.
- sybase, for a Sybase data source.
- teradata, for a Teradata data source.
collibraSystemNames: This section contains the system or server name that is specified in your database and referenced in your connection.
Note This section is only required when you set the Collibra system name flag setting to True.
databases: This section contains the database information. This is required to connect directly to the system or server of the database.
dbname: The name of the database. The database name is the same as the name you entered in the <connectionName> section.
collibraSystemName: The system or server name of the database.
connections: This section contains the connection information. This is required to refer to the system or server of the connection.
connectionName: The name of the connection.
collibraSystemName: The system or server name of the connection.
Example:
{
  "connectionDefinitions": {
    "oracle_source": {
      "dbname": "oracle-source-database-name1",
      "schema": "my Oracle source schema",
      "dialect": "oracle"
    },
    "oracle_target": {
      "dbname": "oracle-target-database-name2",
      "schema": "my other oracle target schema",
      "dialect": "oracle"
    }
  },
  "collibraSystemNames": {
    "databases": [
      { "dbname": "oracle-source-database-name1", "collibraSystemName": "oracle-system-name1" },
      { "dbname": "oracle-target-database-name2", "collibraSystemName": "oracle-system-name2" }
    ],
    "connections": [
      { "connectionName": "oracle-connection-name1", "collibraSystemName": "oracle-system-name1" },
      { "connectionName": "oracle-connection-name2", "collibraSystemName": "oracle-system-name2" }
    ]
  }
}
Important If you are using variables in Informatica PowerCenter, add the value of the variable instead of the name in the connection definitions. For example, if the parameter file contains $DBConnection_dwh=DWH_EXPORT, then you add the following connection definitions:
{ "DWH_EXPORT": { "dbname": "DWH", "schema": "DBO" } }
Tip If you previously created a technical lineage for this data source with connection definitions by using the lineage harvester, you can enter the content from the source ID configuration file in this field.
No
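Before you paste the JSON content into the Source Configuration field, it can help to check it locally. The following is a minimal sketch, not a Collibra tool: the function name check_source_configuration is hypothetical, and the two checks are only the ones that the property descriptions above suggest (the dialect is one of the listed values, and each dbname under databases also appears in a connection definition). Treating dialect as optional is an assumption based on the DWH_EXPORT example, which omits it.

```python
import json

# Dialect values listed in the property description above.
ALLOWED_DIALECTS = {
    "azure", "bigquery", "db2", "hana", "hana-cviews", "hive", "greenplum",
    "mssql", "mysql", "netezza", "oracle", "postgres", "redshift",
    "snowflake", "spark", "sybase", "teradata",
}


def check_source_configuration(text: str) -> list:
    """Return a list of problems; an empty list means that no problem was found."""
    config = json.loads(text)  # raises json.JSONDecodeError on invalid JSON
    problems = []
    definitions = config.get("connectionDefinitions", {})
    for name, props in definitions.items():
        dialect = props.get("dialect")
        # Assumption: a missing dialect is allowed; only listed values are accepted.
        if dialect is not None and dialect not in ALLOWED_DIALECTS:
            problems.append(f"{name}: unknown dialect {dialect!r}")
    defined_dbnames = {props.get("dbname") for props in definitions.values()}
    for entry in config.get("collibraSystemNames", {}).get("databases", []):
        if entry.get("dbname") not in defined_dbnames:
            problems.append(
                f"databases: dbname {entry.get('dbname')!r} is not in connectionDefinitions"
            )
    return problems


sample = """
{
  "connectionDefinitions": {
    "oracle_source": {
      "dbname": "oracle-source-database-name1",
      "schema": "my Oracle source schema",
      "dialect": "oracle"
    }
  },
  "collibraSystemNames": {
    "databases": [
      {"dbname": "oracle-source-database-name1", "collibraSystemName": "oracle-system-name1"}
    ]
  }
}
"""

if __name__ == "__main__":
    print(check_source_configuration(sample))
```

The sketch reports problems instead of raising errors, so that you can see every issue in one pass before you enter the content in the field.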
Advanced Properties: This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing: Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to indicate that the raw source metadata is deleted after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze only: This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case: You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active: This option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General: This section contains general information about logging.
Debug: An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level: An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required?
Capability: This section contains the general information about the capability.
Name: The name of the Edge capability.
Yes
Description: The description of the Edge capability.
Yes
Capability template: The capability template, which determines the next available sections.
Select the following Edge capability:
Technical lineage for Matillion
Yes
Main Properties: This section contains the information for creating a technical lineage.
Source ID: The name of the data source. Specify a name that is unique.
Yes
Matillion connection: The Matillion connection that you created.
Note Collibra Data Intelligence Cloud 2023.03 or newer is required to use the Matillion connection.
No
Group Name: The name of your group in Matillion.
Yes
Project Name: The name of your project in Matillion.
You can only add the name of one project. If you want to create a technical lineage for other projects, add a Technical lineage for Matillion capability for each project.
Yes
Environment Name: The name of your environment in Matillion.
You can only add the name of one environment. If you want to create a technical lineage for other environments, add a Technical lineage for Matillion capability for each environment.
Yes
Dialect: The dialect of the database.
Select one of the following values:
- Snowflake: A Snowflake data source.
- Redshift: An Amazon Redshift data source.
Yes
Start timestamp: The timestamp of tasks in Matillion, which indicates the amount of metadata that technical lineage via Edge collects.
Enter a UNIX timestamp in milliseconds. The default value is 1, which gets as much history as Matillion provides. Matillion provides 7 days of history by default.
Yes
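If you want to limit the collected history to a specific number of days, you need the corresponding UNIX timestamp in milliseconds. The following sketch shows one way to compute it; the function name start_timestamp_ms and its parameters are hypothetical, and the calculation is done in UTC as an assumption, because the source does not state a time zone.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def start_timestamp_ms(days_back: int, now: Optional[datetime] = None) -> int:
    """Return the UNIX timestamp in milliseconds for `days_back` days before `now`."""
    if now is None:
        now = datetime.now(timezone.utc)
    start = now - timedelta(days=days_back)
    # timestamp() returns seconds since the epoch; the field expects milliseconds.
    return int(start.timestamp() * 1000)


if __name__ == "__main__":
    # Value to enter in the Start timestamp field for the last 3 days.
    print(start_timestamp_ms(3))
```

Because Matillion provides 7 days of history by default, a value that reaches further back than that is not expected to return more metadata than the default value of 1.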
Source Configuration: The source configuration for the data source.
No
Collibra system name: The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Advanced Properties: This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing: Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to indicate that the raw source metadata is deleted after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze only: This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case: You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active: This option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General: This section contains general information about logging.
Debug: An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level: An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required?
Capability: This section contains the general information about the capability.
Name: The name of the Edge capability.
Yes
Description: The description of the Edge capability.
Yes
Capability template: The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Mysql
Yes
Main Properties: This section contains the information for creating a technical lineage.
Source ID: The name of the data source. Specify a name that is unique.
Yes
JDBC connection: The JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system name: The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database name: The name of your database.
Tip You can add extra database names by clicking Add property.
Yes
Queries: The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter, for example: where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
- Columns: This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
- Views: This query retrieves the view definitions.
Yes
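If you exclude several schemas, you can generate the filter text instead of typing it by hand. The following sketch builds the same where clause that is shown in the Queries example; the function name views_schema_filter is hypothetical, and the v.table_schema column is taken from that example, so confirm that it matches the query code that is available for your data source.

```python
def views_schema_filter(extra_schemas=()):
    """Build the schema exclusion filter for a Views query."""
    # The two schemas from the documented example are always excluded.
    excluded = ["pg_catalog", "information_schema", *extra_schemas]
    # Double any single quote so that each name stays a valid SQL string literal.
    quoted = ", ".join("'" + name.replace("'", "''") + "'" for name in excluded)
    return f"where v.table_schema not in ({quoted});"


if __name__ == "__main__":
    print(views_schema_filter())
    print(views_schema_filter(["another_schema"]))
```

You then paste the generated text into the Views query in the Queries field.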
Advanced Properties: This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing: Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to indicate that the raw source metadata is deleted after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze only: This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case: You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active: This option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General: This section contains general information about logging.
Debug: An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level: An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required?
Capability: This section contains the general information about the capability.
Name: The name of the Edge capability.
Yes
Description: The description of the Edge capability.
Yes
Capability template: The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Netezza
Yes
Main Properties: This section contains the information for creating a technical lineage.
Source ID: The name of the data source. Specify a name that is unique.
Yes
JDBC connection: The JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system name: The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database name: The name of your database.
Tip You can add extra database names by clicking Add property.
Yes
Queries: The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter, for example: where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
- Columns: This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
- Views: This query retrieves the view definitions.
Yes
Advanced Properties: This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing: Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to indicate that the raw source metadata is deleted after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze only: This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case: You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active: This option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General: This section contains general information about logging.
Debug: An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level: An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required?
Capability: This section contains the general information about the capability.
Name: The name of the Edge capability.
Yes
Description: The description of the Edge capability.
Yes
Capability template: The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Oracle
Yes
Main Properties: This section contains the information for creating a technical lineage.
Source ID: The name of the data source. Specify a name that is unique.
Yes
JDBC connection: The JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system name: The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database name: The name of your database.
Tip You can add extra database names by clicking Add property.
Yes
Queries: The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter, for example: where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
- Columns: This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
- Database links: This query retrieves links to other databases.
- Synonyms: This query retrieves the alternative names for the database objects.
- Views: This query retrieves the view definitions.
- Materialized views: This query retrieves materialized view definitions.
- Other queries: This query retrieves other data that technical lineage needs, for example stored procedures, functions, and packages.
Yes
Advanced Properties: This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing: Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to indicate that the raw source metadata is deleted after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze only: This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case: You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active: This option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General: This section contains general information about logging.
Debug: An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level: An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required?
Capability: This section contains the general information about the capability.
Name: The name of the Edge capability.
Yes
Description: The description of the Edge capability.
Yes
Capability template: The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Postgres
Yes
Main Properties: This section contains the information for creating a technical lineage.
Source ID: The name of the data source. Specify a name that is unique.
Yes
JDBC connection: The JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system name: The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database name: The name of your database.
Tip You can add extra database names by clicking Add property.
Yes
Queries: The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter, for example: where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
- Columns: This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
- Views: This query retrieves the view definitions.
Yes
Advanced Properties: This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing: Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to indicate that the raw source metadata is deleted after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze only: This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case: You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active: This option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General: This section contains general information about logging.
Debug: An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level: An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Hana
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
JDBC connectionThe JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system nameThe system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database nameThe name of your database.
Tip You can add extra database names by clicking Add property.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example: Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. To exclude other schemas, extend the list, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns: This query retrieves all columns, tables, schemas, and databases or projects, in the form: database or project > schema > table > column.
Views: This query retrieves the view definitions.
Calculated views: This query retrieves calculated views.
Dependencies of calculated views: This query retrieves the dependencies of calculated views.
Cross-references of calculated views: This query retrieves the cross-references of calculated views.
Yes
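The schema filter in the Views query example can also be generated from a list instead of being typed by hand. The following is a minimal sketch, assuming the v.table_schema column alias from the example. The build_schema_filter helper is hypothetical and is not part of Edge or Collibra Data Lineage.

```python
def build_schema_filter(excluded_schemas, column="v.table_schema"):
    """Return a WHERE clause that excludes the given schemas (hypothetical helper)."""
    if not excluded_schemas:
        raise ValueError("at least one schema name is required")
    # Double any single quote so each name stays a valid SQL string literal.
    quoted = ", ".join("'" + name.replace("'", "''") + "'" for name in excluded_schemas)
    return f"where {column} not in ({quoted});"


# Reproduces the filter from the example, then the extended variant.
print(build_schema_filter(["pg_catalog", "information_schema"]))
print(build_schema_filter(["pg_catalog", "information_schema", "another_schema"]))
```

Paste the generated clause into the Views query only after you check it against the query code that the capability provides for your data source.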
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing: Technical lineage via Edge harvests raw metadata from the specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata is deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to delete the raw source metadata after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note: Selecting this option can negatively impact performance.
No
Analyze only: This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case: You can use this option to control when to start the full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
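The use case above can be sketched as a small scheduling rule. This is an illustration only: the plan_for function and the dictionary it returns are hypothetical, and the sketch does not call any Edge or Collibra API. It only decides which capabilities would be synchronized on a given day and whether Analyze only would be selected for them.

```python
from datetime import date


def plan_for(day: date) -> dict:
    """Return the hypothetical synchronization plan for the given day."""
    # date.weekday() counts Monday as 0, so 5 and 6 are Saturday and Sunday.
    is_weekend = day.weekday() >= 5
    if is_weekend:
        # Weekend: synchronize C with Analyze only cleared, which also
        # synchronizes the technical lineage for A and B.
        return {"sources": ["C"], "analyze_only": False}
    # Weekdays: only load and analyze A and B.
    return {"sources": ["A", "B"], "analyze_only": True}


print(plan_for(date(2024, 1, 3)))  # a Wednesday
print(plan_for(date(2024, 1, 6)))  # a Saturday
```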
SQL active: An option to determine whether to include or remove the technical lineage of the data source with the SQL-based input.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
No
Calculated views active: An option to determine whether to include or remove the technical lineage of the data source with the calculated views input.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Important: Calculated views are not supported for SAP HANA Cloud. For details, go to Supported data sources for technical lineage.
No
General This section contains general information about logging.
Debug: An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note: We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level: An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required?
Capability This section contains the general information about the capability.
Name: The name of the Edge capability.
Yes
Description: The description of the Edge capability.
Yes
Capability template: The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Snowflake
Yes
Main Properties This section contains the information for creating a technical lineage.
Source ID: The name of the data source. Specify a unique name.
Yes
JDBC connection: The JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system name: The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database name: The name of your database.
Tip: You can add extra database names by clicking Add property.
Yes
Ingestion method: The Snowflake ingestion method that Collibra Data Lineage uses to ingest metadata from Snowflake data sources. Select one of the following values:
- SQL: The SQL Snowflake ingestion mode. Collibra Data Lineage creates a column-level technical lineage based on SQL statements.
- SQL-API: The SQL-API Snowflake ingestion mode. Collibra Data Lineage creates a column-level technical lineage based on Snowflake schemas and the access history.
For more information, go to Technical lineage for Snowflake ingestion methods.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example: Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. To exclude other schemas, extend the list, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
If you select the SQL Snowflake ingestion mode, the following queries apply:
Query Description
Columns: This query retrieves the columns, tables, schemas, and databases or projects, in the form: database or project > schema > table > column.
Views: This query retrieves the view definitions.
If you select the SQL-API Snowflake ingestion mode, the following queries apply:
Query Description
Object dependencies: This query retrieves view definitions.
Columns joined: This query retrieves table and column definition information.
Access history: This query retrieves lineage and transformation details.
Yes
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing: Technical lineage via Edge harvests raw metadata from the specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata is deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to delete the raw source metadata after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note: Selecting this option can negatively impact performance.
No
Analyze only: This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case: You can use this option to control when to start the full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active: This option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
Debug: An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note: We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level: An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required?
Capability This section contains the general information about the capability.
Name: The name of the Edge capability.
Yes
Description: The description of the Edge capability.
Yes
Capability template: The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Mssql
Yes
Main Properties This section contains the information for creating a technical lineage.
Source ID: The name of the data source. Specify a unique name.
Yes
JDBC connection: The JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system name: The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database name: The name of your database.
Tip: You can add extra database names by clicking Add property.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example: Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. To exclude other schemas, extend the list, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns: This query retrieves all columns, tables, schemas, and databases or projects, in the form: database or project > schema > table > column.
Database links: This query retrieves links to other databases.
Synonyms: This query retrieves the alternative names for the database objects.
Views: This query retrieves the view definitions.
Other queries: This query retrieves other data that technical lineage needs, for example stored procedures, functions, and packages.
Yes
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing: Technical lineage via Edge harvests raw metadata from the specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata is deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to delete the raw source metadata after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note: Selecting this option can negatively impact performance.
No
Analyze only: This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case: You can use this option to control when to start the full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active: This option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
Debug: An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note: We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level: An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required?
Capability This section contains the general information about the capability.
Name: The name of the Edge capability.
Yes
Description: The description of the Edge capability.
Yes
Capability template: The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for SQL Server Integration Services (SSIS)
Yes
Main Properties This section contains the information for creating a technical lineage.
Source ID: The name of the data source. Specify a unique name.
Yes
Shared Storage connection: The Shared Storage connection that you created.
Yes
Mask: The pattern of the file names in the directory. By default, the value is *.
No
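To preview which files a mask would select, you can test the pattern locally before you enter it. The sketch below assumes that the mask behaves like a shell-style wildcard pattern, which this page does not confirm. The select_files function and the file names are hypothetical.

```python
from fnmatch import fnmatchcase


def select_files(file_names, mask="*"):
    """Return the names that match the mask (assumed shell-style wildcards)."""
    # fnmatchcase compares case-sensitively on every platform.
    return [name for name in file_names if fnmatchcase(name, mask)]


files = ["load_orders.dtsx", "Project.conmgr", "notes.txt"]
print(select_files(files))            # default mask keeps every file
print(select_files(files, "*.dtsx"))  # keeps only the package file
```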
Source Configuration: The connection definitions, where you specify the relevant translations for each data source. Specify the following properties in JSON format and enter the content in this field.
Connection definitions
Property Description
ConnStringRegExTranslation: The parent element that opens the connection definitions.
<regular expression>: A regular expression that must match one or more connection strings.
Note: Important considerations:
- By default, the regular expression is not case sensitive. As a consequence, a regular expression can match connection strings that contain uppercase or lowercase characters.
- The connection string is part of the SSIS connection manager.
- SSIS connection managers are included in SSIS package files (DTSX) or in connection manager files (CONMGR).
Example
Regular expression: Server=sb-dhub;User ID=SYB_USER2;Initial Catalog=STAGEDB;Port=6306.*
Explanation: The first section, up to .*, is a literal, but not case-sensitive, match of the characters. The dot (.) matches any single character. The asterisk (*) means zero or more of the preceding element, in this case any character.
Match: Any connection string that starts with Server=sb-dhub;User ID=SYB_USER2;Initial Catalog=STAGEDB;Port=6306.
Example: Server=sb-dhub;User ID=SYB_USER2;Initial Catalog=STAGEDB;Port=6306;Persist Security Info=True;Auto Translate=False;
dbname: The name of your database, to which the data source connection refers.
schema: The name of your schema, to which the regular expression refers.
dialect: The dialect of the referenced database. You can enter one of the following values:
- azure, for an Azure SQL Server data source.
- bigquery, for a Google BigQuery data source.
- db2, for an IBM DB2 data source.
- hana, for a SAP HANA data source.
- hana-cviews, for a SAP HANA data source. Important: The hana-cviews dialect is supported for SAP HANA (on-premises). It is not supported for SAP HANA Cloud.
- hive, for a HiveQL data source.
- greenplum, for a Greenplum data source.
- mssql, for a Microsoft SQL Server data source.
- mysql, for a MySQL data source.
- netezza, for a Netezza data source.
- oracle, for an Oracle data source.
- postgres, for a PostgreSQL data source.
- redshift, for an Amazon Redshift data source.
- snowflake, for a Snowflake data source.
- spark, for a Spark SQL data source.
- sybase, for a Sybase data source.
- teradata, for a Teradata data source.
collibraSystemName: The name of the referenced data source's system or server.
The system or server names in table references are considered to be represented by different System assets in Data Catalog. The value of this field is used as the default system or server name.
This property is only required when you set the value of the Collibra system name flag setting to True.
Note: Specify this property with the same name as the full name of the System asset that you created when you registered the data source.
Example:
{
  "ConnStringRegExTranslation": {
    "Data Source=dhb-sql-prod;Initial Catalog=SFG_repl_staging;Provider=SQLNCLI11;Integrated Security=SSPI.*": {
      "dbname": "DATAHUB",
      "schema": "DBO",
      "dialect": "mssql",
      "collibraSystemName": "WAREHOUSE"
    },
    "Server=sb-dhub;User ID=SYS_USER;Initial Catalog=STAGEDB;Port=6306.*": {
      "dbname": "STAGEDB",
      "schema": "STAGE_OWNER",
      "dialect": "sybase",
      "collibraSystemName": ""
    }
  }
}
Tip: If you previously created a technical lineage for this data source with connection definitions by using the lineage harvester, you can enter the content from the source ID configuration file in this field.
No
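Before you paste connection definitions into the Source Configuration field, it can help to check them locally. The following is a minimal sketch and not Collibra's implementation: the find_translation helper is hypothetical, the list of dialects is copied from the description above, and the assumption that a regular expression is matched from the start of the connection string, ignoring case, is inferred from the example on this page.

```python
import json
import re

# Dialect values listed in the Source Configuration description.
ALLOWED_DIALECTS = {
    "azure", "bigquery", "db2", "hana", "hana-cviews", "hive", "greenplum",
    "mssql", "mysql", "netezza", "oracle", "postgres", "redshift",
    "snowflake", "spark", "sybase", "teradata",
}

SOURCE_CONFIGURATION = """
{
  "ConnStringRegExTranslation": {
    "Server=sb-dhub;User ID=SYS_USER;Initial Catalog=STAGEDB;Port=6306.*": {
      "dbname": "STAGEDB",
      "schema": "STAGE_OWNER",
      "dialect": "sybase",
      "collibraSystemName": ""
    }
  }
}
"""


def find_translation(config_text, connection_string):
    """Return the first translation whose regular expression matches, or None."""
    definitions = json.loads(config_text)["ConnStringRegExTranslation"]
    for pattern, translation in definitions.items():
        if translation["dialect"] not in ALLOWED_DIALECTS:
            raise ValueError(f"unsupported dialect: {translation['dialect']}")
        # Assumption: match from the start of the string, ignoring case.
        if re.match(pattern, connection_string, flags=re.IGNORECASE):
            return translation
    return None


match = find_translation(
    SOURCE_CONFIGURATION,
    "server=SB-DHUB;user id=sys_user;Initial Catalog=STAGEDB;Port=6306;Auto Translate=False;",
)
print(match["dbname"] if match else "no matching connection definition")
```

A check like this catches malformed JSON, misspelled dialect values, and regular expressions that do not match the connection strings in your DTSX or CONMGR files.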
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing: Technical lineage via Edge harvests raw metadata from the specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata is deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to delete the raw source metadata after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note: Selecting this option can negatively impact performance.
No
Analyze only: This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case: You can use this option to control when to start the full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active: This option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
Debug: An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note: We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level: An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required?
Capability This section contains the general information about the capability.
Name: The name of the Edge capability.
Yes
Description: The description of the Edge capability.
Yes
Capability template: The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Sybase
Yes
Main Properties This section contains the information for creating a technical lineage.
Source ID: The name of the data source. Specify a unique name.
Yes
JDBC connection: The JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system name: The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database name: The name of your database.
Tip: You can add extra database names by clicking Add property.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example: Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. To exclude other schemas, extend the list, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns: This query retrieves all columns, tables, schemas, and databases or projects, in the form: database or project > schema > table > column.
Views: This query retrieves the view definitions.
Yes
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing: Technical lineage via Edge harvests raw metadata from the specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata is deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to delete the raw source metadata after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note: Selecting this option can negatively impact performance.
No
Analyze only: This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case: You can use this option to control when to start the full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active: This option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
Debug: An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note: We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level: An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required?
Capability This section contains the general information about the capability.
Name: The name of the Edge capability.
Yes
Description: The description of the Edge capability.
Yes
Capability template: The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Teradata
Yes
Main Properties This section contains the information for creating a technical lineage.
Source ID: The name of the data source. Specify a unique name.
Yes
JDBC connection: The JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system name: The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
External database name: The database value to be used in the asset path (system -> database -> schema -> table).
No
Database name: The name of your database.
Tip: You can add extra database names by clicking Add property.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example: Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. To exclude other schemas, extend the list, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns: This query retrieves all columns, tables, schemas, and databases or projects, in the form: database or project > schema > table > column.
Object names: This query retrieves a list of object names from which technical lineage can be created. The objects can include stored procedures, views, macros, and so on.
Yes
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing: Technical lineage via Edge harvests raw metadata from the specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata is deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to delete the raw source metadata after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note: Selecting this option can negatively impact performance.
No
Analyze only: This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case: You can use this option to control when to start the full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active: This option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
Debug: An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note: We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level: An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required?
Capability This section contains the general information about the capability.
Name: The name of the Edge capability.
Yes
Description: The description of the Edge capability.
Yes
Capability template: The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Custom Lineage
Yes
Main Properties This section contains the information for creating a technical lineage.
Source ID: The name of the data source. Specify a unique name.
Yes
Shared Storage connection: The Shared Storage connection that you created.
Yes
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing: Technical lineage via Edge harvests raw metadata from the specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata is deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to delete the raw source metadata after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note: Selecting this option can negatively impact performance.
No
Analyze only: This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case: You can use this option to control when to start the full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active: This option determines whether to include or exclude the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General
This section contains general information about logging.
Debug: An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note: We highly recommend sending Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level: An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
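The Analyze only use case above can be sketched as a small planning function. This is only an illustration of the weekday and weekend split: the source names A, B, and C come from the example, while the function name `plan_for` and the list constants are hypothetical and are not part of any Collibra API. In practice, you set the Analyze only checkbox in each capability and schedule the synchronization yourself.

```python
from datetime import date

# Hypothetical source names, taken from the use case in the text.
WEEKDAY_SOURCES = ["A", "B"]
WEEKEND_SOURCES = ["C"]


def plan_for(day: date) -> list[tuple[str, bool]]:
    """Return (source, analyze_only) pairs to synchronize on the given day."""
    # date.weekday() returns 0 for Monday, so 5 and 6 are Saturday and Sunday.
    is_weekend = day.weekday() >= 5
    if is_weekend:
        # Analyze only is cleared, so this run synchronizes the technical
        # lineage for all data sources, including those analyzed earlier.
        return [(source, False) for source in WEEKEND_SOURCES]
    # Analyze only is selected, so the sources are loaded and analyzed
    # without synchronizing the technical lineage.
    return [(source, True) for source in WEEKDAY_SOURCES]


if __name__ == "__main__":
    print(plan_for(date(2024, 1, 3)))  # a Wednesday
    print(plan_for(date(2024, 1, 6)))  # a Saturday
```

The design keeps the expensive full synchronization to a single weekend run, which matches the intent of the use case.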
Field Description Required?
Capability
This section contains the general information about the capability.
Name: The name of the Edge capability.
Yes
Description: The description of the Edge capability.
Yes
Capability template: The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for SqlDirectory
Yes
Main Properties
This section contains the information for creating a technical lineage.
Source ID: The name of the data source. Specify a name that is unique.
Yes
Shared Storage connection: The Shared Storage connection that you created.
No
Mask: The pattern of the file names in the directory. By default, the value is *.
Yes
Dialect: The dialect of the database.
You can enter one of the following values:
- azure, for an Azure SQL Server data source.
- bigquery, for a Google BigQuery data source.
- db2, for an IBM DB2 data source.
- hana, for a SAP HANA data source.
- hana-cviews, for a SAP HANA data source.
  Important: The hana-cviews dialect is supported for SAP HANA (on-premises). It is not supported for SAP HANA Cloud.
- hive, for a HiveQL data source.
- greenplum, for a Greenplum data source.
- mssql, for a Microsoft SQL Server data source.
- mysql, for a MySQL data source.
- netezza, for a Netezza data source.
- oracle, for an Oracle data source.
- postgres, for a PostgreSQL data source.
- redshift, for an Amazon Redshift data source.
- snowflake, for a Snowflake data source.
- spark, for a Spark SQL data source.
- sybase, for a Sybase data source.
- teradata, for a Teradata data source.
Yes
Collibra system name: The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database: The name of your database.
Note: The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the Technical Lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.
Yes
Schema: The name of the default schema, if not specified in the data source itself. This corresponds to the name of your Schema asset.
Note: The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the Technical Lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.
Yes
Advanced Properties
This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing: Technical lineage via Edge harvests raw metadata from the specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata is deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.
Select this option to delete the raw source metadata after processing.
Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.
Note: Selecting this option can negatively impact performance.
No
Analyze only: This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case: You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active: This option determines whether to include or exclude the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General
This section contains general information about logging.
Debug: An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note: We highly recommend sending Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level: An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
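The precedence rule in the notes on the Database and Schema fields above can be illustrated with a short sketch. It assumes that the fallback applies to each name independently, which the notes do not spell out, and the function and type names are hypothetical. It models only the rule as described, not how Collibra Data Lineage implements stitching.

```python
from typing import NamedTuple, Optional


class StitchingNames(NamedTuple):
    database: str
    schema: str


def names_for_stitching(
    sql_database: Optional[str],
    sql_schema: Optional[str],
    field_database: str,
    field_schema: str,
) -> StitchingNames:
    """Pick the database and schema names used for stitching.

    Names found in the SQL statements take precedence. The Database and
    Schema field values of the capability are used only as a fallback.
    Assumption: each name falls back independently of the other.
    """
    return StitchingNames(
        database=sql_database or field_database,
        schema=sql_schema or field_schema,
    )


if __name__ == "__main__":
    # The SQL statement qualifies the table, so the field values are ignored.
    print(names_for_stitching("sales", "dbo", "default_db", "public"))
    # The SQL statement has no qualifiers, so the field values are used.
    print(names_for_stitching(None, None, "default_db", "public"))
```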
Field Description Required Capability
This section contains the general information about the capability.
Name: The name of the Edge capability.
Yes
Description: The description of the Edge capability.
No
Capability template: The capability template, which determines the next available sections.
Select the following Edge capability:
Yes
GCP Connection
This section contains information about the GCP connection that is used to connect to Google Cloud Platform.
The GCP connection to be used.
Select a GCP connection.
Yes
Field Description Required Capability
This section contains the general information about the capability.
Name: The name of the Edge capability.
Yes
Description: The description of the Edge capability.
No
Capability template: The capability template, which determines the next available sections.
Select the following Edge capability:
Yes
Snowflake Connection
This section contains information about the JDBC connection that is used to connect to Snowflake.
The JDBC connection to be used.
Select a JDBC connection.
Yes
- Click Create.
The capability is added to the Edge site.
The fields become read-only.
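Because the fields become read-only after the capability is added, it can help to check a few values before you click Create. For a Technical Lineage for SqlDirectory capability, the following sketch checks a Dialect value against the allowed values listed earlier and previews which file names a Mask value would select. It assumes that the mask behaves like a shell-style wildcard pattern and that dialect values are matched exactly, neither of which is confirmed by this page. The function names are hypothetical.

```python
from fnmatch import fnmatchcase

# Allowed Dialect values, copied from the list in the SqlDirectory table.
ALLOWED_DIALECTS = {
    "azure", "bigquery", "db2", "hana", "hana-cviews", "hive",
    "greenplum", "mssql", "mysql", "netezza", "oracle", "postgres",
    "redshift", "snowflake", "spark", "sybase", "teradata",
}


def is_allowed_dialect(value: str) -> bool:
    """Return True if the value is one of the documented dialects.

    Assumption: values are compared exactly, including case.
    """
    return value in ALLOWED_DIALECTS


def files_matching(mask: str, file_names: list[str]) -> list[str]:
    """Return the file names that a shell-style mask would select.

    Assumption: the Mask field uses wildcard semantics similar to
    fnmatch, where * matches any sequence of characters.
    """
    return [name for name in file_names if fnmatchcase(name, mask)]


if __name__ == "__main__":
    names = ["orders.sql", "notes.txt", "customers.sql"]
    print(is_allowed_dialect("postgres"))
    print(files_matching("*.sql", names))
    print(files_matching("*", names))  # the default mask selects every file
```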
More information
Technical lineage for JDBC data sources and ETL tools (public beta)