Add a technical lineage capability to an Edge site
After you create a Shared Storage connection, if needed, you can create a technical lineage by adding a technical lineage capability to the Edge site.
Requirements and permissions
A global role that has the following global permissions:
- System administration
- Manage connections and capabilities, for example Edge integration engineer.
- Register profiling information
Steps
Select a data source and, if needed, the connection type to see the related information. Information is available for the following data sources:
Amazon Redshift
Azure SQL Data Warehouse
Azure SQL Server
Azure Synapse Analytics
DB2
Google BigQuery
Greenplum
HiveQL
IBM InfoSphere DataStage
Informatica Intelligent Cloud Services
Informatica PowerCenter
Matillion
Oracle
PostgreSQL
MySQL
Netezza
SAP Hana
Snowflake
Spark SQL
SQL Server
SQL Server Integration Services
Sybase
Teradata
Custom technical lineage
Which connection type do you use?
For best technical lineage results, use the JDBC connection to ingest JDBC sources when possible, rather than the Shared Storage connection with SQL files.
- Open an Edge site.
  - On the main menu, click [icon], and then click Settings.
    The Settings page opens.
  - In the tab pane, click Edge.
    The Edge sites overview appears.
  - In the Edge site overview, click the name of an Edge site with the status Healthy.
    The Edge site page appears.
- In the Capabilities section, click Add capability.
  Note: For Collibra Data Lineage to stitch the data objects in your technical lineage to the assets in Data Catalog, add a Catalog JDBC ingestion capability before you add the technical lineage capability.
  The Add capability page appears.
- Enter the required information.
Capability
This section contains the general information about the capability.
- Name (required): The name of the Edge capability.
- Description (required): The description of the Edge capability.
- Capability template (required): The capability template, which determines the next available sections. Select the following Edge capability: Technical Lineage for Redshift.
Main Properties
This section contains the information for creating a technical lineage.
- Source ID (required): The name of the data source. Specify a name that is unique.
- JDBC connection (required): The JDBC connection that you created for Catalog JDBC ingestion.
- Collibra system name (required): The system or server name of the data source. This field is also the full name of your System asset in Data Catalog. The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
- Database name (required): The name of your database.
  Tip: You can add extra database names by clicking Add property.
- Queries (required): The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
  Example: Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema');
  This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter to, for example: where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
  The following queries are available:
  - Columns: This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
  - Views: This query retrieves the view definitions.
Advanced Properties
This section contains the advanced properties for creating a technical lineage.
- Delete raw metadata after processing (optional): Technical lineage via Edge harvests metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance for processing. This option indicates whether the source metadata is deleted after it is processed.
  - Select this option to delete the source metadata after processing.
  - Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
  Note: Selecting this option can negatively impact performance.
- Analyze only (optional): This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances. When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
  Use case: You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
  - During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
  - On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
- Active (required): This option determines whether to include or exclude the technical lineage of the data source.
  - Select this option to include the technical lineage of this data source.
  - Clear the checkbox to exclude the technical lineage of this data source.
General
This section contains general information about logging.
- Debug (optional): An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
  Note: We highly recommend that you only send Edge infrastructure log files to Collibra Data Intelligence Cloud when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
- Log level (optional): An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
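The schema filter from the Queries example can also be generated rather than typed by hand. The following is a minimal sketch, not part of the product: the helper name build_schema_filter is hypothetical, and the v.table_schema column is taken from the example filter above. It only builds the filter text that you would then paste into the Views query.

```python
def build_schema_filter(excluded_schemas, column="v.table_schema"):
    """Return a filter that excludes the given schemas (hypothetical helper)."""
    if not excluded_schemas:
        raise ValueError("at least one schema name is required")
    # Double any single quote so a schema name cannot break out of its SQL literal.
    quoted = ", ".join("'" + name.replace("'", "''") + "'" for name in excluded_schemas)
    return f"where {column} not in ({quoted});"


# The default exclusion from the example, and the extended variant.
default_filter = build_schema_filter(["pg_catalog", "information_schema"])
extended_filter = build_schema_filter(["pg_catalog", "information_schema", "another_schema"])
print(default_filter)
print(extended_filter)
```

Keeping the excluded schemas in one list makes it easier to apply the same exclusions to several capabilities.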
Capability
This section contains the general information about the capability.
- Name (required): The name of the Edge capability.
- Description (required): The description of the Edge capability.
- Capability template (required): The capability template, which determines the next available sections. Select the following Edge capability: Technical Lineage for Spark. Select the Technical lineage capability template for your data source to create a technical lineage for the JDBC data source.
  Important: Technical lineage via Edge is only available in private beta. Please create a support ticket to get access.
Main Properties
This section contains the information for creating a technical lineage.
- Source ID (required): The name of the data source. Specify a name that is unique.
- JDBC connection (required): The JDBC connection that you created for Catalog JDBC ingestion.
- Collibra system name (required): The system or server name of the data source. This field is also the full name of your System asset in Data Catalog. The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
- Database name (required): The name of your database.
  Tip: You can add extra database names by clicking Add property.
- Queries (required): The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
  Example: Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema');
  This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter to, for example: where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
  The following queries are available:
  - Columns: This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
  - Object names: This query retrieves a list of object names from which technical lineage can be created. The objects can include stored procedures, views, macros, and so on.
Advanced Properties
This section contains the advanced properties for creating a technical lineage.
- Delete raw metadata after processing (optional): Technical lineage via Edge harvests metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance for processing. This option indicates whether the source metadata is deleted after it is processed.
  - Select this option to delete the source metadata after processing.
  - Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
  Note: Selecting this option can negatively impact performance.
- Analyze only (optional): This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances. When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
  Use case: You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
  - During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
  - On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
- Active (required): This option determines whether to include or exclude the technical lineage of the data source.
  - Select this option to include the technical lineage of this data source.
  - Clear the checkbox to exclude the technical lineage of this data source.
General
This section contains general information about logging.
- Debug (optional): An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
  Note: We highly recommend that you only send Edge infrastructure log files to Collibra Data Intelligence Cloud when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
- Log level (optional): An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
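The Analyze only use case amounts to a small scheduling rule. The sketch below is illustrative only: it does not call any Collibra API, and the function name plan_for and the source IDs A, B, and C are assumptions taken from the use case text. It returns which sources to synchronize on a given date and whether the Analyze only checkbox would be selected for them.

```python
from datetime import date

# Hypothetical source IDs from the use case: A and B run on weekdays, C on the weekend.
WEEKDAY_SOURCES = ["A", "B"]
WEEKEND_SOURCES = ["C"]


def plan_for(day):
    """Return (source_id, analyze_only) pairs to synchronize on the given date."""
    is_weekend = day.weekday() >= 5  # Monday is 0, so 5 and 6 are Saturday and Sunday
    if is_weekend:
        # Analyze only cleared: this run also synchronizes the technical lineage.
        return [(source, False) for source in WEEKEND_SOURCES]
    # Analyze only selected: sources are loaded and analyzed, lineage is not synchronized.
    return [(source, True) for source in WEEKDAY_SOURCES]


print(plan_for(date(2024, 1, 3)))  # a Wednesday
print(plan_for(date(2024, 1, 6)))  # a Saturday
```

According to the use case, the weekend run for source C is the one that synchronizes the technical lineage for all three sources, because A and B were already loaded and analyzed during the week.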
Capability
This section contains the general information about the capability.
- Name (required): The name of the Edge capability.
- Description (required): The description of the Edge capability.
- Capability template (required): The capability template, which determines the next available sections. Select the following Edge capability: Technical Lineage for Azure.
Main Properties
This section contains the information for creating a technical lineage.
- Source ID (required): The name of the data source. Specify a name that is unique.
- JDBC connection (required): The JDBC connection that you created for Catalog JDBC ingestion.
- Collibra system name (required): The system or server name of the data source. This field is also the full name of your System asset in Data Catalog. The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
- Database name (required): The name of your database.
  Tip: You can add extra database names by clicking Add property.
- Queries (required): The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
  Example: Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema');
  This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter to, for example: where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
  The following queries are available:
  - Columns: This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
  - Synonyms: This query retrieves the alternative names for the database objects.
  - Views: This query retrieves the view definitions.
  - Other queries: This query retrieves other data that technical lineage needs, for example stored procedures, functions, and packages.
Advanced Properties
This section contains the advanced properties for creating a technical lineage.
- Delete raw metadata after processing (optional): Technical lineage via Edge harvests metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance for processing. This option indicates whether the source metadata is deleted after it is processed.
  - Select this option to delete the source metadata after processing.
  - Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
  Note: Selecting this option can negatively impact performance.
- Analyze only (optional): This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances. When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
  Use case: You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
  - During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
  - On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
- Active (required): This option determines whether to include or exclude the technical lineage of the data source.
  - Select this option to include the technical lineage of this data source.
  - Clear the checkbox to exclude the technical lineage of this data source.
General
This section contains general information about logging.
- Debug (optional): An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
  Note: We highly recommend that you only send Edge infrastructure log files to Collibra Data Intelligence Cloud when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
- Log level (optional): An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
Capability
This section contains the general information about the capability.
- Name (required): The name of the Edge capability.
- Description (required): The description of the Edge capability.
- Capability template (required): The capability template, which determines the next available sections. Select the following Edge capability: Technical Lineage for Azure.
Main Properties
This section contains the information for creating a technical lineage.
- Source ID (required): The name of the data source. Specify a name that is unique.
- JDBC connection (required): The JDBC connection that you created for Catalog JDBC ingestion.
- Collibra system name (required): The system or server name of the data source. This field is also the full name of your System asset in Data Catalog. The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
- Database name (required): The name of your database.
  Tip: You can add extra database names by clicking Add property.
- Schema (required): The name of the default schema, if not specified in the data source itself. This corresponds to the name of your Schema asset.
- Queries (required): The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
  Example: Enter the following filter in a Views query: where v.table_schema not in ('pg_catalog', 'information_schema');
  This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter to, for example: where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
  The following queries are available:
  - Columns: This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
  - Synonyms: This query retrieves the alternative names for the database objects.
  - Views: This query retrieves the view definitions.
  - Other queries: This query retrieves other data that technical lineage needs, for example stored procedures, functions, and packages.
Advanced Properties
This section contains the advanced properties for creating a technical lineage.
- Delete raw metadata after processing (optional): Technical lineage via Edge harvests metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance for processing. This option indicates whether the source metadata is deleted after it is processed.
  - Select this option to delete the source metadata after processing.
  - Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
  Note: Selecting this option can negatively impact performance.
- Analyze only (optional): This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances. When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
  Use case: You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
  - During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
  - On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
- Active (required): This option determines whether to include or exclude the technical lineage of the data source.
  - Select this option to include the technical lineage of this data source.
  - Clear the checkbox to exclude the technical lineage of this data source.
General
This section contains general information about logging.
- Debug (optional): An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
  Note: We highly recommend that you only send Edge infrastructure log files to Collibra Data Intelligence Cloud when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
- Log level (optional): An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
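The Schema field supplies a default schema for objects whose schema is not specified in the data source. As a rough illustration of that idea, and not a description of how Collibra Data Lineage resolves names internally, the sketch below qualifies a table name with a default schema and renders it in the database > schema > table form used by the Columns query. The function name qualify and the sample names are hypothetical.

```python
def qualify(object_name, database, default_schema):
    """Return 'database > schema > table' for a possibly unqualified name.

    Illustrative assumption: names are either 'table' or 'schema.table'.
    """
    parts = object_name.split(".")
    if len(parts) == 1:
        # No schema given in the source, so fall back to the default schema.
        schema, table = default_schema, parts[0]
    elif len(parts) == 2:
        schema, table = parts
    else:
        raise ValueError(f"unexpected object name: {object_name!r}")
    return " > ".join([database, schema, table])


print(qualify("orders", "sales_db", "dbo"))
print(qualify("audit.orders", "sales_db", "dbo"))
```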
Capability
This section contains the general information about the capability.
- Name (required): The name of the Edge capability.
- Description (required): The description of the Edge capability.
- Capability template (required): The capability template, which determines the next available sections. Select the following Edge capability: Technical Lineage for DataStage.
Main Properties
This section contains the information for creating a technical lineage.
- Source ID (required): The name of the data source. Specify a name that is unique.
- Shared Storage connection (required): The Shared Storage connection that you created.
- Mask (optional): The pattern of the file names in the directory. By default, the value is *.
- Source Configuration (optional): The connection definitions, where you specify relevant translations for each data source. Specify the following properties in JSON format and enter the content in this field.
  Connection definition properties:
  - OdbcDataSources: The Open Database Connectivity (ODBC) data sources in IBM InfoSphere DataStage for which you want to create a technical lineage.
    - <data-source-name>: The ODBC data source name that you use in your DataStage projects. This section contains the properties to translate the database, schema and dialect.
      - dbname: The name of your database, to which the ODBC data source connection refers.
      - schema: The name of your schema, to which the ODBC data source connection refers.
      - dialect: The dialect of the referenced database. You can enter one of the following values:
        - azure, for an Azure SQL Server data source.
        - bigquery, for a Google BigQuery data source.
        - db2, for an IBM DB2 data source.
        - hana, for a SAP HANA data source.
        - hana-cviews, for a SAP HANA data source.
          Important: The hana-cviews dialect is supported for SAP HANA (on-premises). It is not supported for SAP HANA Cloud.
        - hive, for a HiveQL data source.
        - greenplum, for a Greenplum data source.
        - mssql, for a Microsoft SQL Server data source.
        - mysql, for a MySQL data source.
        - netezza, for a Netezza data source.
        - oracle, for an Oracle data source.
        - postgres, for a PostgreSQL data source.
        - redshift, for an Amazon Redshift data source.
        - snowflake, for a Snowflake data source.
        - spark, for a Spark SQL data source.
        - sybase, for a Sybase data source.
        - teradata, for a Teradata data source.
      - collibraSystemName: The name of the data source's system or server. This property is only required when you set the value of the Collibra system name flag setting to True.
        Note: Specify this property with the same name as the full name of the System asset that you created when you registered the data source.
  - NonOdbcConnectors: Other data source connectors in IBM InfoSphere DataStage for which you want to create a technical lineage. For example, DB2, Oracle or Netezza.
    Note: This section is optional.
    - <data-source-connector-ID>: The data source username and database of the connector that you use in your DataStage projects, for example admin@database-name. The combination of the username and database name should be unique. The following section contains the properties to translate the database, schema and dialect.
      - dbname: The name of your database, to which the data source connection refers.
      - schema: The name of your schema, to which the data source connection refers.
      - dialect: The dialect of the referenced database. You can enter one of the following values:
        - azure, for an Azure SQL Server data source.
        - bigquery, for a Google BigQuery data source.
        - db2, for an IBM DB2 data source.
        - hana, for a SAP HANA data source.
        - hana-cviews, for a SAP HANA data source.
          Important: The hana-cviews dialect is supported for SAP HANA (on-premises). It is not supported for SAP HANA Cloud.
        - hive, for a HiveQL data source.
        - greenplum, for a Greenplum data source.
        - mssql, for a Microsoft SQL Server data source.
        - mysql, for a MySQL data source.
        - netezza, for a Netezza data source.
        - oracle, for an Oracle data source.
        - postgres, for a PostgreSQL data source.
        - redshift, for an Amazon Redshift data source.
        - snowflake, for a Snowflake data source.
        - spark, for a Spark SQL data source.
        - sybase, for a Sybase data source.
        - teradata, for a Teradata data source.
      - collibraSystemName: The name of the data source's system or server. This property is only required when you set the value of the Collibra system name flag setting to True. Specify this property with the same name as the full name of the System asset that you created when you registered the data source.
  Example:
    {
      "OdbcDataSources": {
        "oracle-data-source": {
          "dbname": "my-oracle-database",
          "schema": "my-oracle-schema",
          "dialect": "oracle",
          "collibraSystemName": "my-system"
        },
        "mssql-data-source": {
          "dbname": "my-mssql-database",
          "schema": "my-mssql-schema",
          "dialect": "mssql",
          "collibraSystemName": "my-system"
        }
      },
      "NonOdbcConnectors": {
        "admin@database-name": {
          "dbname": "my-netezza-database",
          "schema": "my-netezza-schema",
          "dialect": "netezza",
          "collibraSystemName": "my-system"
        },
        "admin@second-database-name": {
          "dbname": "my-second-netezza-database",
          "schema": "my-second-netezza-schema",
          "dialect": "netezza",
          "collibraSystemName": "my-system"
        }
      }
    }
  Tip: If you previously created a technical lineage for this data source with connection definitions by using the lineage harvester, you can enter the content from the connection_definitions.conf file in this field.
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance, for processing. This option indicates whether or not the source metadata should be deleted after it is processed.
Select this option to indicate that the source metadata is deleted after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze onlyThis option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the
load-sourcesandanalyzecommand with a source specified when you use the lineage harvester.This option is not enabled by default.
Use caseYou can use this option to control when to start full synchronization of data sources. For example if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
Debug An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
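The Analyze only use case described above can be sketched as a small scheduling helper. This is only an illustration of the decision, not a Collibra API: the source names A, B, and C come from the use case, and the function name plan_for and the returned mapping are hypothetical.

```python
from datetime import date

# Source names taken from the use case; the grouping is an assumption.
WEEKDAY_SOURCES = ["A", "B"]
WEEKEND_SOURCES = ["C"]


def plan_for(day: date) -> dict:
    """Return {source: analyze_only} for the capabilities to synchronize on `day`."""
    is_weekend = day.weekday() >= 5  # Monday is 0, so 5 and 6 are the weekend
    if is_weekend:
        # Analyze only is cleared, so this run triggers the full synchronization.
        return {source: False for source in WEEKEND_SOURCES}
    # Analyze only is selected, so sources are loaded and analyzed only.
    return {source: True for source in WEEKDAY_SOURCES}


if __name__ == "__main__":
    print(plan_for(date.today()))
```

On a weekday the helper returns A and B with Analyze only selected. On the weekend it returns C with Analyze only cleared, which is the run that synchronizes the technical lineage for all data sources.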
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Db2
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
JDBC connectionThe JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system nameThe system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database nameThe name of your database.
Tip You can add extra database names by clicking Add property.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
Views This query retrieves the view definitions.
Yes
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance, for processing. This option indicates whether or not the source metadata should be deleted after it is processed.
Select this option to indicate that the source metadata is deleted after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze only This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
Debug An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
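The schema filter from the Queries example above can be generated rather than typed by hand. The following is a minimal sketch, assuming that you only want to extend the list of excluded schemas. The helper name views_filter is hypothetical, and the result is plain text that you paste into the Views query yourself.

```python
# Schemas excluded in the documented example.
DEFAULT_EXCLUDED = ("pg_catalog", "information_schema")


def views_filter(extra_schemas=()):
    """Build the `where ... not in (...)` filter for a Views query."""
    schemas = list(DEFAULT_EXCLUDED)
    for name in extra_schemas:
        # Reject quotes so that a schema name cannot break out of the literal.
        if "'" in name:
            raise ValueError(f"unexpected quote in schema name: {name!r}")
        if name not in schemas:
            schemas.append(name)
    quoted = ", ".join(f"'{name}'" for name in schemas)
    return f"where v.table_schema not in ({quoted});"


if __name__ == "__main__":
    print(views_filter(["another_schema"]))
```

Building the list in one place keeps the quoting consistent when you exclude more than one additional schema.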
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Bigquery
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
JDBC connectionThe JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system nameThe system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Project IDThe ID of the project.
Tip You can add extra project IDs by clicking Add property.
Yes
Region The location of your BigQuery data. This is the region that you specified when you created the dataset.
No
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns
This query retrieves the columns, tables, schemas, databases or projects fields in the form: database or project > schema > table > column.
Columns tail
This query retrieves all columns tails.
Views
This query retrieves the view definitions.
Dataset names
This query retrieves all logical units in the project.
Other queries
This query retrieves other data that technical lineage needs, for example stored procedures, functions, and packages.
Yes
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance, for processing. This option indicates whether or not the source metadata should be deleted after it is processed.
Select this option to indicate that the source metadata is deleted after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze only This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
Debug An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Greenplum
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
JDBC connectionThe JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system nameThe system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database nameThe name of your database.
Tip You can add extra database names by clicking Add property.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
Views This query retrieves the view definitions.
Yes
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance, for processing. This option indicates whether or not the source metadata should be deleted after it is processed.
Select this option to indicate that the source metadata is deleted after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze only This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
Debug An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Hive
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
JDBC connectionThe JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system nameThe system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
External database nameThe database value to be used in the asset path (system -> database -> schema -> table).
No
Database nameThe name of your database.
Tip You can add extra database names by clicking Add property.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
Object names This query retrieves a list of object names from which technical lineage can be created. The objects can include stored procedures, views, macros, and so on.
Yes
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance, for processing. This option indicates whether or not the source metadata should be deleted after it is processed.
Select this option to indicate that the source metadata is deleted after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze only This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
Debug An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical lineage for Informatica Intelligent Cloud Services (IICS)
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
IICS connectionThe Informatica Intelligent Cloud Services (IICS) connection that you created.
Note Collibra Data Intelligence Cloud 2023.03 or newer is required to use the Informatica Intelligent Cloud Services (IICS) connection.
No
Objects The objects that you want to export. Each object requires a path and a type, for example:
"objects": [
  { "path" : "Sales", "type" : "Project" },
  { "path" : "Finance/Task_Flows", "type" : "Folder" },
  { "path" : "Common/Task_Flows/tf_CalendarDimension", "type" : "Taskflow" }
]
Tip For more information about the objects that you can export and the required information, go to the Informatica documentation.
No
Parameter FilesThe Informatica Intelligent Cloud Services parameter files.
No
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance, for processing. This option indicates whether or not the source metadata should be deleted after it is processed.
Select this option to indicate that the source metadata is deleted after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze only This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
Debug An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
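The Objects field for Informatica Intelligent Cloud Services described above requires a path and a type for each object. The following is a minimal sketch of how the list could be assembled and checked before you paste it into the field. The helper name build_objects is hypothetical, and the sketch does not check the type against the values that Informatica accepts, because those are not listed here.

```python
import json


def build_objects(entries):
    """Turn (path, type) pairs into the JSON array used after "objects":."""
    objects = []
    for path, object_type in entries:
        # Each object requires both a path and a type.
        if not path or not object_type:
            raise ValueError("each object requires a path and a type")
        objects.append({"path": path, "type": object_type})
    return json.dumps(objects, indent=2)


if __name__ == "__main__":
    print(build_objects([
        ("Sales", "Project"),
        ("Finance/Task_Flows", "Folder"),
        ("Common/Task_Flows/tf_CalendarDimension", "Taskflow"),
    ]))
```

The output is only the array. Whether the field also expects the surrounding "objects": key is an assumption to verify against the example in the field description.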
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Informatica PowerCenter
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
Shared Storage connectionThe Shared Storage connection that you created.
Yes
Mask The pattern of the file names in the directory. By default, the value is *.
No
Source Configuration The connection definitions and system names. Specify the following properties in JSON format and enter the content in this field.
Connection definitions
Property Description
connectionDefinitions This section contains the connection properties to a source in Informatica PowerCenter.
<connectionName> The type of your source or target data source. This section contains the connection properties to a source or target in Informatica PowerCenter.
dbname The name of your source or target database.
schema The name of your source or target schema.
dialect The dialect of the referenced database. You can enter one of the following values:
- azure, for an Azure SQL Server data source.
- bigquery, for a Google BigQuery data source.
- db2, for an IBM DB2 data source.
- hana, for a SAP HANA data source.
- hana-cviews, for a SAP HANA data source.
Important The hana-cviews dialect is supported for SAP HANA (on-premises). It is not supported for SAP HANA Cloud.
- hive, for a HiveQL data source.
- greenplum, for a Greenplum data source.
- mssql, for a Microsoft SQL Server data source.
- mysql, for a MySQL data source.
- netezza, for a Netezza data source.
- oracle, for an Oracle data source.
- postgres, for a PostgreSQL data source.
- redshift, for an Amazon Redshift data source.
- snowflake, for a Snowflake data source.
- spark, for a Spark SQL data source.
- sybase, for a Sybase data source.
- teradata, for a Teradata data source.
collibraSystemNames This section contains the system or server name that is specified in your database and referenced in your connection.
Note This section is only required when you set the Collibra system name flag setting to True.
databases This section contains the database information. This is required to connect directly to the system or server of the database.
dbname The name of the database. The database name is the same as the name you entered in the <connectionName> section.
collibraSystemName The system or server name of the database.
connections This section contains the connection information. This is required to reference the system or server of the connection.
connectionName The name of the connection.
collibraSystemName The system or server name of the connection.
See the following example:
{
  "connectionDefinitions": {
    "oracle_source": {
      "dbname": "oracle-source-database-name1",
      "schema": "my Oracle source schema",
      "dialect": "oracle"
    },
    "oracle_target": {
      "dbname": "oracle-target-database-name2",
      "schema": "my other oracle target schema",
      "dialect": "oracle"
    }
  },
  "collibraSystemNames": {
    "databases": [
      { "dbname": "oracle-source-database-name1", "collibraSystemName": "oracle-system-name1" },
      { "dbname": "oracle-target-database-name2", "collibraSystemName": "oracle-system-name2" }
    ],
    "connections": [
      { "connectionName": "oracle-connection-name1", "collibraSystemName": "oracle-system-name1" },
      { "connectionName": "oracle-connection-name2", "collibraSystemName": "oracle-system-name2" }
    ]
  }
}
Important If you are using variables in Informatica PowerCenter, add the value of the variable instead of the name in the connection definitions. For example, if the parameter file contains $DBConnection_dwh=DWH_EXPORT, then you add the following connection definitions:
{ "DWH_EXPORT": { "dbname": "DWH", "schema": "DBO" } }
Tip If you previously created a technical lineage for this data source with connection definitions by using the lineage harvester, you can enter the content from the source ID configuration file in this field.
No
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance, for processing. This option indicates whether or not the source metadata should be deleted after it is processed.
Select this option to indicate that the source metadata is deleted after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze only This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
Debug An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
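Before you paste a Source Configuration for Informatica PowerCenter into the field described above, it can help to check it locally. The following is a minimal sketch, assuming the JSON layout of the documented example. The function name check_source_configuration is hypothetical, and the checks cover only two rules stated above: the dialect must be one of the listed values, and each dbname under collibraSystemNames must match a dbname in the connection definitions.

```python
import json

# Dialect values listed in the Source Configuration description.
ALLOWED_DIALECTS = {
    "azure", "bigquery", "db2", "hana", "hana-cviews", "hive", "greenplum",
    "mssql", "mysql", "netezza", "oracle", "postgres", "redshift",
    "snowflake", "spark", "sybase", "teradata",
}


def check_source_configuration(text):
    """Return a list of problems found in a Source Configuration JSON string."""
    config = json.loads(text)
    problems = []
    definitions = config.get("connectionDefinitions", {})
    for name, props in definitions.items():
        dialect = props.get("dialect")
        # The dialect is treated as optional here; only a present value is checked.
        if dialect is not None and dialect not in ALLOWED_DIALECTS:
            problems.append(f"{name}: unknown dialect {dialect!r}")
    defined_dbnames = {props.get("dbname") for props in definitions.values()}
    system_names = config.get("collibraSystemNames", {})
    for entry in system_names.get("databases", []):
        dbname = entry.get("dbname")
        if dbname not in defined_dbnames:
            problems.append(f"databases: dbname {dbname!r} is not in connectionDefinitions")
    return problems


if __name__ == "__main__":
    sample = '{"connectionDefinitions": {"src": {"dbname": "db1", "dialect": "oracle"}}}'
    print(check_source_configuration(sample))
```

An empty list means that the sketch found nothing to report. It does not guarantee that the capability accepts the configuration.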
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical lineage for Matillion
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
Matillion connectionThe Matillion connection that you created.
Note Collibra Data Intelligence Cloud 2023.03 or newer is required to use the Matillion connection.
No
Group NameThe name of your group in Matillion.
Yes
Project NameThe name of your project in Matillion.
You can only add the name of one project. If you want to create a technical lineage for other projects, add a technical lineage for Matillion capability for each project.
Yes
Environment NameThe name of your environment in Matillion.
You can only add the name of one environment. If you want to create a technical lineage for other environments, add a technical lineage for Matillion capability for each environment.
Yes
DialectThe dialect of the database.
Select one of the following values:
- Snowflake, for a Snowflake data source.
- Redshift, for an Amazon Redshift data source.
Yes
Start timestampThe timestamp of tasks in Matillion, which indicates the amount of metadata that technical lineage via Edge collects.
Specify this field as a UNIX timestamp in milliseconds. The default value is 1, which gets as much history as Matillion provides. Matillion provides 7 days of history by default.
Yes
Source ConfigurationThe source configuration for the data source.
No
Collibra system nameThe system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
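The Start timestamp field above expects a UNIX timestamp in milliseconds. The following is a minimal sketch of how such a value can be computed. The helper names to_unix_millis and start_timestamp are hypothetical, and the choice of a look-back window in days is only an example.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_millis(moment):
    """Convert a timezone-aware datetime to a UNIX timestamp in milliseconds."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid local-time ambiguity")
    # Integer division of two timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def start_timestamp(days_back, now=None):
    """Return the timestamp for `days_back` days before `now` (default: current UTC time)."""
    now = now or datetime.now(timezone.utc)
    return to_unix_millis(now - timedelta(days=days_back))


if __name__ == "__main__":
    print(start_timestamp(3))
```

Because Matillion provides 7 days of history by default, a look-back window longer than that would likely return the same metadata as the default value of 1.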
Advanced Properties
This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing
Technical lineage via Edge harvests metadata from the specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance for processing. This option indicates whether the source metadata is deleted after it is processed.
Select this option to delete the source metadata after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note: Selecting this option can negatively impact performance.
No
Analyze only
This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case: You can use this option to control when to start the full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize them as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active
This option determines whether to include or exclude the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General
This section contains general information about logging.
Debug
An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note: We highly recommend that you only send Edge infrastructure log files to Collibra Data Intelligence Cloud when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level
An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
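The Start timestamp field of the Matillion capability expects a UNIX timestamp in milliseconds. As a minimal sketch, assuming you want to limit collection to a fixed number of days before a given moment, the value could be computed as follows. The function name start_timestamp_ms and the 3-day window are illustrative assumptions and are not part of Collibra or Matillion.

```python
from datetime import datetime, timedelta, timezone


def start_timestamp_ms(days_back, now=None):
    """Return the UNIX timestamp in milliseconds for `days_back` days before `now`."""
    if now is None:
        now = datetime.now(timezone.utc)
    start = now - timedelta(days=days_back)
    # timestamp() returns seconds since the epoch; the field expects milliseconds.
    return int(start.timestamp() * 1000)


# A fixed reference time keeps the example reproducible.
reference = datetime(2023, 3, 10, tzinfo=timezone.utc)
print(start_timestamp_ms(3, now=reference))
```

Keep in mind that Matillion provides 7 days of history by default, so a value further in the past is unlikely to return more metadata.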
Field Description Required?
Capability
This section contains the general information about the capability.
Name
The name of the Edge capability.
Yes
Description
The description of the Edge capability.
Yes
Capability template
The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Mysql
Yes
Main Properties
This section contains the information for creating a technical lineage.
Source ID
The name of the data source. Specify a name that is unique.
Yes
JDBC connection
The JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system name
The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database name
The name of your database.
Tip: You can add extra database names by clicking Add property.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example: Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns
This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
Views
This query retrieves the view definitions.
Yes
Advanced Properties
This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing
Technical lineage via Edge harvests metadata from the specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance for processing. This option indicates whether the source metadata is deleted after it is processed.
Select this option to delete the source metadata after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note: Selecting this option can negatively impact performance.
No
Analyze only
This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case: You can use this option to control when to start the full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize them as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active
This option determines whether to include or exclude the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General
This section contains general information about logging.
Debug
An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note: We highly recommend that you only send Edge infrastructure log files to Collibra Data Intelligence Cloud when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level
An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
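The Analyze only use case described in the table above can be pictured as a simple schedule. The following is a minimal sketch, assuming three hypothetical data sources named A, B, and C. The function plan_for is illustrative only: it is not a Collibra API, and it only models which capabilities you would synchronize on a given day and whether Analyze only is selected.

```python
from datetime import date


def plan_for(day):
    """Return (source, analyze_only) pairs to synchronize on `day`.

    Hypothetical plan that mirrors the use case: A and B run on weekdays
    with Analyze only selected; C runs on the weekend without it, which
    starts the full synchronization for all data sources.
    """
    if day.weekday() < 5:  # Monday is 0, Friday is 4
        return [("A", True), ("B", True)]
    return [("C", False)]


for source, analyze_only in plan_for(date(2023, 3, 6)):
    print(source, "analyze only" if analyze_only else "full synchronization")
```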
Field Description Required?
Capability
This section contains the general information about the capability.
Name
The name of the Edge capability.
Yes
Description
The description of the Edge capability.
Yes
Capability template
The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Netezza
Yes
Main Properties
This section contains the information for creating a technical lineage.
Source ID
The name of the data source. Specify a name that is unique.
Yes
JDBC connection
The JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system name
The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database name
The name of your database.
Tip: You can add extra database names by clicking Add property.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example: Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns
This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
Views
This query retrieves the view definitions.
Yes
Advanced Properties
This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing
Technical lineage via Edge harvests metadata from the specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance for processing. This option indicates whether the source metadata is deleted after it is processed.
Select this option to delete the source metadata after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note: Selecting this option can negatively impact performance.
No
Analyze only
This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case: You can use this option to control when to start the full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize them as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active
This option determines whether to include or exclude the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General
This section contains general information about logging.
Debug
An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note: We highly recommend that you only send Edge infrastructure log files to Collibra Data Intelligence Cloud when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level
An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required?
Capability
This section contains the general information about the capability.
Name
The name of the Edge capability.
Yes
Description
The description of the Edge capability.
Yes
Capability template
The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Oracle
Yes
Main Properties
This section contains the information for creating a technical lineage.
Source ID
The name of the data source. Specify a name that is unique.
Yes
JDBC connection
The JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system name
The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database name
The name of your database.
Tip: You can add extra database names by clicking Add property.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example: Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns
This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
Database links
This query retrieves links to other databases.
Synonyms
This query retrieves the alternative names for the database objects.
Views
This query retrieves the view definitions.
Materialized views
This query retrieves materialized view definitions.
Other queries
This query retrieves other data that technical lineage needs, for example stored procedures, functions, and packages.
Yes
Advanced Properties
This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing
Technical lineage via Edge harvests metadata from the specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance for processing. This option indicates whether the source metadata is deleted after it is processed.
Select this option to delete the source metadata after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note: Selecting this option can negatively impact performance.
No
Analyze only
This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case: You can use this option to control when to start the full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize them as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active
This option determines whether to include or exclude the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General
This section contains general information about logging.
Debug
An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note: We highly recommend that you only send Edge infrastructure log files to Collibra Data Intelligence Cloud when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level
An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required?
Capability
This section contains the general information about the capability.
Name
The name of the Edge capability.
Yes
Description
The description of the Edge capability.
Yes
Capability template
The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Postgres
Yes
Main Properties
This section contains the information for creating a technical lineage.
Source ID
The name of the data source. Specify a name that is unique.
Yes
JDBC connection
The JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system name
The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database name
The name of your database.
Tip: You can add extra database names by clicking Add property.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example: Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns
This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
Views
This query retrieves the view definitions.
Yes
Advanced Properties
This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing
Technical lineage via Edge harvests metadata from the specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance for processing. This option indicates whether the source metadata is deleted after it is processed.
Select this option to delete the source metadata after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note: Selecting this option can negatively impact performance.
No
Analyze only
This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case: You can use this option to control when to start the full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize them as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active
This option determines whether to include or exclude the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General
This section contains general information about logging.
Debug
An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note: We highly recommend that you only send Edge infrastructure log files to Collibra Data Intelligence Cloud when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level
An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
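To see what the schema filter from the Queries example does, the following minimal sketch applies the same not in condition to a small in-memory SQLite table. The table named views, its rows, and the function view_schemas are hypothetical stand-ins for illustration. They are not the actual queries that the capability provides.

```python
import sqlite3

# In-memory stand-in for a catalog of views; names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE views (table_schema TEXT, table_name TEXT)")
conn.executemany(
    "INSERT INTO views VALUES (?, ?)",
    [
        ("pg_catalog", "pg_tables"),
        ("information_schema", "columns"),
        ("sales", "v_orders"),
        ("another_schema", "v_tmp"),
    ],
)


def view_schemas(excluded):
    """Return the schemas that remain after excluding the given schema names."""
    placeholders = ", ".join("?" for _ in excluded)
    rows = conn.execute(
        "SELECT v.table_schema FROM views v "
        f"WHERE v.table_schema NOT IN ({placeholders}) "
        "ORDER BY v.table_schema",
        list(excluded),
    ).fetchall()
    return [row[0] for row in rows]


print(view_schemas(["pg_catalog", "information_schema"]))
print(view_schemas(["pg_catalog", "information_schema", "another_schema"]))
```

Adding a schema name to the list removes its views from the result, which is the effect of extending the filter in the Views query.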
Field Description Required?
Capability
This section contains the general information about the capability.
Name
The name of the Edge capability.
Yes
Description
The description of the Edge capability.
Yes
Capability template
The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Hana
Yes
Main Properties
This section contains the information for creating a technical lineage.
Source ID
The name of the data source. Specify a name that is unique.
Yes
JDBC connection
The JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system name
The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database name
The name of your database.
Tip: You can add extra database names by clicking Add property.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example: Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns
This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
Views
This query retrieves the view definitions.
Calculated views
This query retrieves calculated views.
Dependencies of calculated views
This query retrieves dependencies of calculated views.
Cross-references of calculated views
This query retrieves cross-references of calculated views.
Yes
Advanced Properties
This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing
Technical lineage via Edge harvests metadata from the specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance for processing. This option indicates whether the source metadata is deleted after it is processed.
Select this option to delete the source metadata after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note: Selecting this option can negatively impact performance.
No
Analyze only
This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case: You can use this option to control when to start the full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize them as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
SQL active
An option to determine whether to include or exclude the technical lineage of the data source with the SQL-based input.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
No
Calculated views active
An option to determine whether to include or exclude the technical lineage of the data source with the calculated views input.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Important: Calculated views are not supported for SAP HANA Cloud. For details, go to Supported data sources for technical lineage.
No
General
This section contains general information about logging.
Debug
An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note: We highly recommend that you only send Edge infrastructure log files to Collibra Data Intelligence Cloud when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level
An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required?
Capability
This section contains the general information about the capability.
Name
The name of the Edge capability.
Yes
Description
The description of the Edge capability.
Yes
Capability template
The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Snowflake
Yes
Main Properties
This section contains the information for creating a technical lineage.
Source ID
The name of the data source. Specify a name that is unique.
Yes
JDBC connection
The JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system name
The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database name
The name of your database.
Tip: You can add extra database names by clicking Add property.
Yes
Ingestion method
The Snowflake ingestion method that Collibra Data Lineage uses to ingest metadata from Snowflake data sources. Select one of the following values:
- SQL: The SQL Snowflake ingestion mode. Collibra Data Lineage creates a column-level technical lineage based on SQL statements.
- SQL-API: The SQL-API Snowflake ingestion mode. Collibra Data Lineage creates a column-level technical lineage based on Snowflake schemas and the access history.
For more information, go to Technical lineage for Snowflake ingestion methods.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example: Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
If you select the SQL Snowflake ingestion mode, the following queries apply:
Query Description
Columns
This query retrieves the columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
Views
This query retrieves the view definitions.
If you select the SQL-API Snowflake ingestion mode, the following queries apply:
Query Description Object dependencies
This query retrieves view definitions.
Columns joined
This query retrieves table and column definition information.
Access history
This query retrieves lineage and transformation details.
Yes
Advanced Properties
This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processing
Technical lineage via Edge harvests metadata from the specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance for processing. This option indicates whether the source metadata is deleted after it is processed.
Select this option to delete the source metadata after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note: Selecting this option can negatively impact performance.
No
Analyze only
This option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester. This option is not enabled by default.
Use case: You can use this option to control when to start the full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize them as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
Active
This option determines whether to include or exclude the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General
This section contains general information about logging.
Debug
An option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note: We highly recommend that you only send Edge infrastructure log files to Collibra Data Intelligence Cloud when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log level
An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
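For Snowflake, the set of queries that applies depends on the selected ingestion method. The following minimal sketch restates that relationship from the table above as a lookup. The names QUERIES_BY_METHOD and queries_for are illustrative assumptions and do not correspond to any Collibra API.

```python
# Query names per Snowflake ingestion method, as listed in the Queries field.
QUERIES_BY_METHOD = {
    "SQL": ["Columns", "Views"],
    "SQL-API": ["Object dependencies", "Columns joined", "Access history"],
}


def queries_for(method):
    """Return the query names that apply to the given ingestion method."""
    try:
        return QUERIES_BY_METHOD[method]
    except KeyError:
        raise ValueError(f"Unknown ingestion method: {method!r}") from None


print(queries_for("SQL-API"))
```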
Field Description Required?
Capability
This section contains the general information about the capability.
Name
The name of the Edge capability.
Yes
Description
The description of the Edge capability.
Yes
Capability template
The capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Mssql
Yes
Main Properties
This section contains the information for creating a technical lineage.
Source ID
The name of the data source. Specify a name that is unique.
Yes
JDBC connection
The JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system name
The system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database name
The name of your database.
Tip: You can add extra database names by clicking Add property.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example: Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This filter excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the filter, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns
This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
Database links
This query retrieves links to other databases.
Synonyms
This query retrieves the alternative names for the database objects.
Views
This query retrieves the view definitions.
Other queries
This query retrieves other data that technical lineage needs, for example stored procedures, functions, and packages.
Yes
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance, for processing. This option indicates whether or not the source metadata should be deleted after it is processed.
Select this option to indicate that the source metadata is deleted after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze onlyThis option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
DebugAn option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
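The schema filter described for the Views query can be tried out locally before you edit the query code. The following is a minimal sketch only: it uses an in-memory SQLite table named views as a hypothetical stand-in for the source's metadata, and the helper harvested_views is an illustrative name, not part of Collibra Data Lineage. It shows how adding a schema to the not in list removes that schema's views from the result.

```python
import sqlite3

# Hypothetical stand-in for the metadata that a Views query reads.
# Real data sources expose their own catalog tables and columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE views (table_schema TEXT, table_name TEXT)")
conn.executemany(
    "INSERT INTO views VALUES (?, ?)",
    [
        ("pg_catalog", "pg_tables"),
        ("information_schema", "columns"),
        ("sales", "v_orders"),
        ("another_schema", "v_tmp"),
    ],
)


def harvested_views(excluded_schemas):
    """Return (schema, view) rows that survive the NOT IN filter."""
    placeholders = ", ".join("?" for _ in excluded_schemas)
    sql = (
        "SELECT v.table_schema, v.table_name FROM views v "
        f"WHERE v.table_schema NOT IN ({placeholders}) "
        "ORDER BY v.table_schema, v.table_name"
    )
    return conn.execute(sql, list(excluded_schemas)).fetchall()


# Default filter from the example: system schemas are excluded.
default_rows = harvested_views(["pg_catalog", "information_schema"])

# Adjusted filter: another_schema is excluded as well.
adjusted_rows = harvested_views(
    ["pg_catalog", "information_schema", "another_schema"]
)
```

With the default filter, both the sales and another_schema views remain. With the adjusted filter, only the sales view remains.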
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for SQL Server Integration Services (SSIS)
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
Shared Storage connectionThe Shared Storage connection that you created.
Yes
MaskThe pattern of the file names in the directory. By default, the value is *.
No
Source ConfigurationThe connection definitions, where you specify relevant translations for each data source. Specify the following properties in JSON format and enter the content in this field.
Connection definitions
Property Description
ConnStringRegExTranslation The parent element that opens the connection definitions.
<regular expression> A regular expression that must match one or more connection strings.
Note Important considerations:
- By default, the regular expression is not case sensitive. As a result, it can match connection strings that contain uppercase or lowercase characters.
- The connection string is part of the SSIS connection manager.
- SSIS connection managers are included in SSIS package files (DTSX) or in connection manager files (CONMGR).
Example Regular expression:
Server=sb-dhub;User ID=SYB_USER2;Initial Catalog=STAGEDB;Port=6306.*
Explanation: The first section, up to .*, is a literal, but not case-sensitive, match of the characters. The dot (.) can match any single character. The asterisk (*) means zero or more of the previous element, in this case any character.
Match: Any connection string that starts with Server=sb-dhub;User ID=SYB_USER2;Initial Catalog=STAGEDB;Port=6306.
Example: Server=sb-dhub;User ID=SYB_USER2;Initial Catalog=STAGEDB;Port=6306;Persist Security Info=True;Auto Translate=False;
dbname The name of your database, to which the data source connection refers.
schema The name of your schema, to which the regular expression refers.
dialect The dialect of the referenced database.
You can enter one of the following allowed values:
- azure, for an Azure SQL Server data source.
- bigquery, for a Google BigQuery data source.
- db2, for an IBM DB2 data source.
- hana, for a SAP HANA data source.
- hana-cviews, for a SAP HANA data source.
Important The hana-cviews dialect is supported for SAP HANA (on-premises). It is not supported for SAP HANA Cloud.
- hive, for a HiveQL data source.
- greenplum, for a Greenplum data source.
- mssql, for a Microsoft SQL Server data source.
- mysql, for a MySQL data source.
- netezza, for a Netezza data source.
- oracle, for an Oracle data source.
- postgres, for a PostgreSQL data source.
- redshift, for an Amazon Redshift data source.
- snowflake, for a Snowflake data source.
- spark, for a Spark SQL data source.
- sybase, for a Sybase data source.
- teradata, for a Teradata data source.
collibraSystemName The name of the referenced data source's system or server.
The system or server names in table references are considered to be represented by different System assets in Data Catalog. The value of this field is used as the default system or server name.
This property is only required when you set the value of the Collibra system name flag setting to True.
Note Specify this property with the same name as the full name of the System asset that you created when you registered the data source.
Example:
{
  "ConnStringRegExTranslation": {
    "Data Source=dhb-sql-prod;Initial Catalog=SFG_repl_staging;Provider=SQLNCLI11;Integrated Security=SSPI.*": {
      "dbname": "DATAHUB",
      "schema": "DBO",
      "dialect": "mssql",
      "collibraSystemName": "WAREHOUSE"
    },
    "Server=sb-dhub;User ID=SYS_USER;Initial Catalog=STAGEDB;Port=6306.*": {
      "dbname": "STAGEDB",
      "schema": "STAGE_OWNER",
      "dialect": "sybase",
      "collibraSystemName": ""
    }
  }
}
Tip If you previously created a technical lineage for this data source with connection definitions by using the lineage harvester, you can enter the content from the source ID configuration file in this field.
No
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance, for processing. This option indicates whether or not the source metadata should be deleted after it is processed.
Select this option to indicate that the source metadata is deleted after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze onlyThis option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
DebugAn option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
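Before you paste connection definitions into the Source Configuration field of the SSIS capability, it can help to check locally that the JSON parses and that each regular expression matches the connection strings you expect. The following is a minimal sketch, not the way Collibra Data Lineage evaluates the definitions: it assumes Python's re.match with re.IGNORECASE is a fair approximation of the "starts with, not case sensitive" behavior described for the regular expression property, and the translate function is a hypothetical helper. The JSON content is the example from the field description.

```python
import json
import re

# Connection definitions, copied from the Source Configuration example.
SOURCE_CONFIGURATION = """
{
  "ConnStringRegExTranslation": {
    "Data Source=dhb-sql-prod;Initial Catalog=SFG_repl_staging;Provider=SQLNCLI11;Integrated Security=SSPI.*": {
      "dbname": "DATAHUB",
      "schema": "DBO",
      "dialect": "mssql",
      "collibraSystemName": "WAREHOUSE"
    },
    "Server=sb-dhub;User ID=SYS_USER;Initial Catalog=STAGEDB;Port=6306.*": {
      "dbname": "STAGEDB",
      "schema": "STAGE_OWNER",
      "dialect": "sybase",
      "collibraSystemName": ""
    }
  }
}
"""


def translate(connection_string, configuration_text=SOURCE_CONFIGURATION):
    """Return the first translation whose pattern matches, or None.

    Assumption: matching is anchored at the start of the connection
    string and ignores case, as the field description suggests.
    """
    definitions = json.loads(configuration_text)["ConnStringRegExTranslation"]
    for pattern, translation in definitions.items():
        if re.match(pattern, connection_string, flags=re.IGNORECASE):
            return translation
    return None


# A connection string with different casing and extra trailing properties.
sybase_match = translate(
    "server=SB-DHUB;user id=sys_user;Initial Catalog=STAGEDB;Port=6306;"
    "Persist Security Info=True;Auto Translate=False;"
)

# A connection string that none of the definitions cover.
no_match = translate("Server=other-host;Initial Catalog=OTHERDB;")
```

A result of None for a connection string that appears in your DTSX or CONMGR files suggests that the regular expression needs to be adjusted.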
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Sybase
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
JDBC connectionThe JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system nameThe system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
Database nameThe name of your database.
Tip You can add extra database names by clicking Add property.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This query excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the query to, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
Views This query retrieves the view definitions.
Yes
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance, for processing. This option indicates whether or not the source metadata should be deleted after it is processed.
Select this option to indicate that the source metadata is deleted after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze onlyThis option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
DebugAn option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Teradata
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
JDBC connectionThe JDBC connection that you created for Catalog JDBC ingestion.
Yes
Collibra system nameThe system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
External database nameThe database value to be used in the asset path (system -> database -> schema -> table).
No
Database nameThe name of your database.
Tip You can add extra database names by clicking Add property.
Yes
Queries
The queries to download all the data that is required to create technical lineage. The queries vary depending on the data source you use. The query code is automatically available. However, you can modify the query code if needed.
Example Enter the following filter in a Views query:
where v.table_schema not in ('pg_catalog', 'information_schema');
This query excludes the pg_catalog and information_schema schemas, which don't contain customer data. If you want to exclude other schemas, adjust the query to, for example:
where v.table_schema not in ('pg_catalog', 'information_schema', 'another_schema');
Query Description
Columns This query retrieves all columns, tables, schemas, databases or projects in the form: database or project > schema > table > column.
Object names This query retrieves a list of object names from which technical lineage can be created. The objects can include stored procedures, views, macros, and so on.
Yes
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance, for processing. This option indicates whether or not the source metadata should be deleted after it is processed.
Select this option to indicate that the source metadata is deleted after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze onlyThis option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
DebugAn option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for Custom Lineage
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
Shared Storage connectionThe Shared Storage connection that you created.
Yes
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance, for processing. This option indicates whether or not the source metadata should be deleted after it is processed.
Select this option to indicate that the source metadata is deleted after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze onlyThis option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
DebugAn option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
Field Description Required? Capability This section contains the general information about the capability.
NameThe name of the Edge capability.
Yes
DescriptionThe description of the Edge capability.
Yes
Capability templateThe capability template, which determines the next available sections.
Select the following Edge capability:
Technical Lineage for SqlDirectory
Yes
Main Properties This section contains the information for creating a technical lineage.
Source IDThe name of the data source. Specify a name that is unique.
Yes
Shared Storage connectionThe Shared Storage connection that you created.
No
MaskThe pattern of the file names in the directory. By default, the value is *.
Yes
DialectThe dialect of the database.
You can enter one of the following allowed values:
- azure, for an Azure SQL Server data source.
- bigquery, for a Google BigQuery data source.
- db2, for an IBM DB2 data source.
- hana, for a SAP HANA data source.
- hana-cviews, for a SAP HANA data source.
Important The hana-cviews dialect is supported for SAP HANA (on-premises). It is not supported for SAP HANA Cloud.
- hive, for a HiveQL data source.
- greenplum, for a Greenplum data source.
- mssql, for a Microsoft SQL Server data source.
- mysql, for a MySQL data source.
- netezza, for a Netezza data source.
- oracle, for an Oracle data source.
- postgres, for a PostgreSQL data source.
- redshift, for an Amazon Redshift data source.
- snowflake, for a Snowflake data source.
- spark, for a Spark SQL data source.
- sybase, for a Sybase data source.
- teradata, for a Teradata data source.
Yes
Collibra system nameThe system or server name of the data source. This field is also the full name of your System asset in Data Catalog.
The value of this field must be the same as the full name of the System asset that you created when you registered the data source.
Yes
DatabaseThe name of your database.
Note The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the Technical Lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.
Yes
SchemaThe name of the default schema, if not specified in the data source itself. This corresponds to the name of your Schema asset.
Note The database and schema names in the SQL statements in your SQL files take precedence over the values that you provide for the Database and Schema fields in the Technical Lineage for SqlDirectory capability. If your SQL statements contain database and schema names, Collibra Data Lineage uses them for stitching. If your SQL statements do not contain database and schema names, Collibra Data Lineage uses the values of the Database and Schema fields in the capability for stitching. For more information, go to Prepare the SQL directory and Automatic stitching for technical lineage.
Yes
Advanced Properties This section contains the advanced properties for creating a technical lineage.
Delete raw metadata after processingTechnical lineage via Edge harvests metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance, for processing. This option indicates whether or not the source metadata should be deleted after it is processed.
Select this option to indicate that the source metadata is deleted after processing.
Clear the checkbox to keep the source metadata after processing. The metadata is stored in the Collibra infrastructure.
Note Selecting this option can negatively impact performance.
No
Analyze onlyThis option determines whether to only load and analyze the source data on the Collibra Data Lineage service instances.
When you select this option, the technical lineage of the data source is not created during the synchronization of the capability. Selecting this option is equivalent to entering the load-sources and analyze commands with a source specified when you use the lineage harvester.
This option is not enabled by default.
Use case You can use this option to control when to start full synchronization of data sources. For example, if you have three data sources, A, B, and C, you can synchronize the data sources as follows:
- During weekdays, synchronize data sources A and B with the Analyze Only checkbox selected in the capabilities. Collibra Data Lineage only loads and analyzes data sources A and B without synchronizing the technical lineage.
- On the weekend, synchronize data source C without selecting the Analyze Only checkbox in the capability. Collibra Data Lineage synchronizes the technical lineage for all data sources including A, B, and C.
No
ActiveThe option determines whether to include or remove the technical lineage of the data source.
Select this option to include the technical lineage of this data source.
Clear the checkbox to exclude the technical lineage of this data source.
Yes
General This section contains general information about logging.
DebugAn option to automatically send Edge infrastructure log files to Collibra Data Intelligence Cloud. By default, this option is set to false.
Note We highly recommend that you send Edge infrastructure log files to Collibra Data Intelligence Cloud only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.
No
Log levelAn option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.
No
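The precedence rule in the notes for the Database and Schema fields of the SqlDirectory capability can be illustrated with a small sketch. This is not how Collibra Data Lineage parses SQL: resolve_table is a hypothetical helper that only splits a dotted table reference, it ignores quoted identifiers, and the handling of a reference that names a schema but no database is an assumption that the source does not describe.

```python
def resolve_table(reference, default_database, default_schema):
    """Return (database, schema, table) for a dotted table reference.

    Names found in the reference win. The defaults stand in for the
    Database and Schema fields of the capability and are used only
    for the parts that the reference leaves out.
    """
    parts = reference.split(".")
    if len(parts) == 3:
        # Fully qualified: the capability fields are not used.
        database, schema, table = parts
    elif len(parts) == 2:
        # Assumption: a schema-qualified name borrows only the database.
        database = default_database
        schema, table = parts
    elif len(parts) == 1:
        # Unqualified: both capability fields are used.
        database, schema, table = default_database, default_schema, parts[0]
    else:
        raise ValueError(f"Unsupported table reference: {reference!r}")
    return (database, schema, table)


# The SQL statement names the database and schema, so they take precedence.
qualified = resolve_table("SALES.PUBLIC.ORDERS", "DEFAULT_DB", "DBO")

# The SQL statement names only the table, so the field values are used.
unqualified = resolve_table("ORDERS", "DEFAULT_DB", "DBO")
```

The practical consequence is that the Database and Schema field values matter for stitching mainly when your SQL files use unqualified table names.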
- Click Save.
The capability is added to the Edge site.
The fields become read-only.