Add the Tableau technical lineage capability

Name

The name of the capability.

Yes

Description

The description of the capability.

No

Source ID

The name of the data source. The name must be unique and cannot contain special characters, for example, /.

Warning

You can specify only one source ID per Tableau Server or Tableau Online account. Ingesting the same Tableau Server or Tableau Online account under different source IDs will fail.
Any single Tableau Server or Tableau Online account can be ingested only once. If you create more than one connection for the same Tableau Server or Tableau Online account, integration will fail. To ingest from multiple Tableau Server or Tableau Online accounts, create a new Edge connection and configure a new capability template for each one, each with a unique source ID.

Warning If you are switching between the lineage harvester and Edge, the value in this field must exactly match the value of the id property in your lineage harvester configuration file.

Yes

TechLin Admin Connection (in preview)

If you want to use the OAuth authentication type to connect to the Collibra Data Lineage service instances, you have to create a Technical Lineage Admin Edge or Collibra Cloud site connection and select the OAuth authentication type. Then, in this field, specify the name of the Technical Lineage Admin connection.

For more information about the authentication types, go to Create a Technical Lineage Admin connection.

No

Tableau connection

The Tableau connection that you created for ingestion in Data Catalog.

Tip Select the name that you provided in the Name field when you created a connection to Tableau.

Yes

Domain ID

The unique reference ID of the domain in Collibra Platform in which you want to ingest the Tableau assets.

Yes

REST only

Indicates whether you want to use only the Tableau REST API, or both the Tableau REST API and the Tableau Metadata API, to harvest Tableau metadata.

Cleared: The lineage harvester will use the REST API and Metadata API to harvest Tableau metadata.
Selected (default): The lineage harvester will only use the REST API to harvest Tableau metadata.

Note This field must be cleared to:

Enable technical lineage and the automatic stitching of Column assets to Tableau Data Attribute assets.
Harvest owner information for Tableau projects, workbooks and data models.

No

Exclude images

Indicates whether you want to exclude the downloading of images.

Cleared: Images are downloaded.
Selected (default): Images are not downloaded.

Note The maximum number of images that can be uploaded to Collibra per day is determined by the configuration of the file upload service, in Collibra Console. For complete details, see the Upload configuration settings in DGC service configuration: options.

No

Site ID

The site IDs of the Tableau sites that you want to include in the ingestion process.

To ingest from multiple Tableau sites, enter each site ID in a separate Site ID field.

To ingest the default Tableau site, enter "Default" or leave the field empty. This field is not case sensitive.

Warning If you enter "Default", you must include the double quotation marks. The site IDs of any other Tableau sites must not be enclosed in double quotation marks. If the formatting of the site IDs does not conform to this detail, ingestion will fail.

Example

Tip Ensure that you specify the correct value. The correct value is the site identifier as it appears in the URL of the site to which you want to sign in. When you manually sign in to Tableau Server or Tableau Online, the site ID is the value that appears after /site/ in the browser address bar. In the following example URLs, the site ID is MarketingTeam:

Tableau Server: http://MyServer/#/site/MarketingTeam/projects
Tableau Online: https://10ay.online.tableau.com/#/site/MarketingTeam/workbooks

On Tableau Server, however, the URL of the default site does not specify the site. For example, the URL for a view named Profits, on a site named Sales, is http://localhost/#/site/sales/views/profits. The URL for this same view on the default site is http://localhost/#/views/profits. The site name Sales does not figure in the URL.
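The rule described above can be sketched in a few lines of Python. This is only an illustration for checking a value before you enter it, not part of the integration; the function name `site_id_from_url` is hypothetical, and the sketch assumes that the site ID always directly follows `/site/` in the part of the URL after `#`.

```python
from urllib.parse import urlparse


def site_id_from_url(url: str) -> str:
    """Return the site ID that follows /site/ in a Tableau URL.

    Returns an empty string when the URL does not name a site,
    which is the case for the default site on Tableau Server.
    """
    # Tableau places the route in the URL fragment (the part after '#').
    fragment = urlparse(url).fragment
    parts = [part for part in fragment.split("/") if part]
    # The site ID is the segment right after "site".
    if len(parts) >= 2 and parts[0] == "site":
        return parts[1]
    return ""


if __name__ == "__main__":
    print(site_id_from_url("http://MyServer/#/site/MarketingTeam/projects"))
    print(site_id_from_url("http://localhost/#/views/profits"))
```

An empty result corresponds to the default site, for which you enter "Default" or leave the Site ID field empty.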

Yes

Site Name

The site name, or names, of the Tableau sites you specified in the Site ID field.

If you don't provide a site ID in the Site ID field, or if you enter "Default", leave this field empty.

You must enter a name for every site ID you enter.

Concurrency level

This field is intended to help if you are experiencing HTTP 401 Unauthorized errors due to too many concurrent HTTP calls that use the same token. It allows you to specify the internal sizing, meaning the number of tasks that can be executed at the same time.

The default value is 10, meaning as many as 10 HTTP requests can take place in parallel. Consider reducing the value if you are experiencing HTTP 401 Unauthorized errors. Setting the value to 1 effectively disables concurrency, so that HTTP requests run one after another instead of in parallel.
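To illustrate what a concurrency level means in general, the following Python sketch runs placeholder tasks through a thread pool and reports how many ran at the same time. It is not how Collibra Data Lineage is implemented; `run_with_concurrency` and `fake_request` are hypothetical names, and the short sleep stands in for an HTTP call.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_concurrency(task_count: int, concurrency_level: int) -> int:
    """Run placeholder tasks and return the peak number running at once."""
    lock = threading.Lock()
    in_flight = 0
    peak = 0

    def fake_request(_: int) -> None:
        nonlocal in_flight, peak
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        time.sleep(0.01)  # stand-in for an HTTP call
        with lock:
            in_flight -= 1

    # max_workers caps how many tasks can run in parallel.
    with ThreadPoolExecutor(max_workers=concurrency_level) as pool:
        list(pool.map(fake_request, range(task_count)))
    return peak


if __name__ == "__main__":
    print(run_with_concurrency(20, 10))  # never more than 10 at once
    print(run_with_concurrency(20, 1))   # tasks run one after another
```

With a level of 1, the tasks never overlap, which mirrors the sequential behavior described above.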

No

Source configuration

This field allows you to provide JSON code for database mapping, domain mapping and filtering.

This field has a size limit. If your JSON content exceeds 256 KB, do not use this field. Instead, use the Source Configuration File field to prevent the synchronization job from failing.
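If you are unsure whether your JSON content is close to the limit, you can measure it before pasting it. The following Python sketch is an illustration only; it assumes that 256 KB means 262,144 bytes of UTF-8 encoded text, which this topic does not state explicitly, and the function name is hypothetical.

```python
import json

# Assumption: 256 KB is interpreted as 256 * 1024 bytes.
MAX_FIELD_BYTES = 256 * 1024


def fits_in_source_configuration_field(config: dict) -> bool:
    """Return True if the serialized JSON is within the assumed limit."""
    payload = json.dumps(config).encode("utf-8")
    return len(payload) <= MAX_FIELD_BYTES


if __name__ == "__main__":
    small = {"filters": {"sites": {"Training": "ca60b822-781b-4b3a-b44d-f65bd107ff92"}}}
    print(fits_in_source_configuration_field(small))
```

If the check returns False, upload the content as a .json file in the Source Configuration File field instead.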

If you previously integrated Tableau via the lineage harvester, you can paste the JSON code from your Tableau <source ID> configuration file into this field.

Connection definition properties

Property	Description	Mandatory?
collibraSystemNames	This section contains the system information for different Tableau data sources. Depending on the kind of data source or connection, you have to specify how to connect to this data source. For more information, see the Tableau documentation. We also recommend that you check the list of supported connectors in Tableau.	No
files	This section contains connection information to one or more files in Tableau. If you do not have files in Tableau, you can remove this section.	No
filePath	The full path to the file. For example, the path to a JSON file.	No
collibraSystemName	The system name of the file.	No
connectors	This section contains connection information to one or more connectors in Tableau. If you do not have connectors in Tableau, you can remove this section. The values that you specify for this property are not case-sensitive.	No
connectorUrl	The URL of the connector. For example, the URL to Google Analytics.	No
collibraSystemName	The system name of the connector.	No
cloudFiles	This section contains connection information to one or more cloud files in Tableau's input data. If you do not have cloud files in Tableau, you can remove this section.	No
name	The name of the file. For example, the name of a Zendesk file.	No
collibraSystemName	The system name of the cloud file.	No
hostnameMapping	This section allows you to map Tableau technical database, server and schema names to the respective real names, to preserve stitching. Warning `hostnameMapping` replaces the following deprecated properties, which have been removed from this topic: the `databaseMapping` property and the `databases` sub-section of the `collibraSystemNames` section. `hostnameMapping` must not be used in combination with either of these properties. If you use the `hostnameMapping` section, you can still use the `collibraSystemName` property in conjunction with the `files`, `connectors` or `cloudFiles` sub-sections. Example configuration: `"hostnameMapping": { "found_dbname=databasename1;found_hostname=*;found_schema=test": { "dbname": "mssql-database-name", "schema": "mssql-schema-name", "dialect": "mssql", "collibraSystemName": "mssql-system-name" } }` For more example configurations, go to Tableau hostname, schema, and system name mapping.	No
found_dbname=<database name>;found_hostname=<server name>;found_schema=<schema name>	The database information of supported data sources in Tableau that is typically collected by the lineage harvester. It allows you to specify the name of the database (found_dbname), on which server a database is running (found_hostname), and optionally, the name of the schema (found_schema).	No
dbname	The name of the database of a supported data source in Tableau.	No
schema	The name of the default schema of a supported data source in Tableau. If the lineage harvester fails to find a specific schema, it uses the default schema.	No
dialect	The dialect of the supported data source in Tableau. You don't have to specify a dialect; it will automatically be detected. If, however, you are using a dialect that is not supported, you can use this property to specify a supported dialect that is a close match. That way, most of your queries will be detected and processed. The dialects of supported data sources in Tableau are: `redshift`, for an Amazon Redshift data source; `azure`, for an Azure SQL Server data source; `bigquery`, for a Google BigQuery data source; `greenplum`, for a Greenplum data source; `hive`, for a HiveQL data source; `oracle`, for an Oracle data source; `postgres`, for a PostgreSQL data source; `mssql`, for a Microsoft SQL Server data source; `mysql`, for a MySQL data source; `netezza`, for a Netezza data source; `hana`, for a SAP HANA data source; `spark`, for a Spark SQL data source; `sybase`, for a Sybase data source; `teradata`, for a Teradata data source.	No
filters	This section defines from which Tableau sites and projects you want to harvest metadata, and into which domains in Collibra you want to ingest the corresponding assets. Filtering is transitive, which means that all resources in a specified project, such as Tableau workbooks and all sub-projects, are ingested. Tableau assets that are not mapped to the specified domains, for example the Tableau Server assets and the parent projects (if you specify their sub-projects), are ingested in the default domain. Important Filtering does not affect the amount of raw metadata that is harvested from Tableau and sent to the Collibra Data Lineage service instance. Rather, it determines which metadata is ingested as assets in Data Catalog. The `domainMapping` and `filters` sections are mutually exclusive. Do not include both `domainMapping` and `filters` sections in your JSON file. Tip If you want to ingest all of the projects in a Tableau site into multiple domains in Collibra, use the `domainMapping` section. If you want to ingest all of the projects in a Tableau site into the default domain, use only the `domainID` property in the lineage harvester configuration file. The `domainID` property represents the default domain. If you want to ingest all of the projects in a Tableau site into a single domain in Collibra, use site filtering. If you want to ingest metadata from only some of the projects in a Tableau site, use project filtering. You can use site filtering and project filtering together: If filtering on the same site, this "filtering" is actually domain mapping, because nothing is filtered out. The contents of the projects are ingested in the specified domains, and the rest of the contents of the site are ingested in a different, specified domain. If you are site filtering on a specific site and project filtering a different site, then site filtering is again a form of domain mapping, and the filtered projects are ingested in their specified domains. If your lineage harvester configuration file includes sites that are not mentioned in the `filters` section of your <source ID> configuration file, those sites are ingested in the default domain.	No
sites	The Tableau sites to be ingested and the domain in which you want to ingest metadata from the Tableau sites. If you have only one Tableau site, do not include a `sites` section in your <source ID> file. Instead, use a `projects` section, to filter on Tableau projects. Include a `sites` section only if all of the following are true: You have more than one Tableau site. You want to ingest all of the metadata from only one Tableau site into a single domain in Collibra. The domain into which you want to ingest is not the default domain, meaning the domain specified in the `domainId` property in your lineage harvester configuration file.	No
site_name: domain_id	`site_name` The name of the site to be ingested. The site name is case-sensitive. `domain_id` The unique reference ID of the domain in Collibra in which you want to ingest metadata. The domain ID is case-sensitive. To ingest all metadata from a Tableau site in the specified domain, specify the site name and a separate domain ID for each site that you list in the `siteIds` property in the lineage harvester configuration file for Tableau. If the `site_name` or `domain_id` property is not specified for a site, the metadata from the site is ingested in the default domain. To find a domain reference ID, open the relevant domain in Collibra. The URL looks like: https://<yourcollibrainstance>/domain/22258f64-40b6-4b16-9c08-c95f8ec0da26?view=00000000-0000-0000-0000-000000040001. In this example, the reference ID is 22258f64-40b6-4b16-9c08-c95f8ec0da26. Example configuration: `{ "filters":{ "sites":{ "Training":"ca60b822-781b-4b3a-b44d-f65bd107ff92" }, "projects":{ "Testing > Databricks":"e8f4e4a8-4062-4a33-9b44-3ce3e18e4e22", "Product Demo > Customer Insights":"a305e6f7-7a49-49aa-aa85-41b1e689121b" } } }`	No
projects	The Tableau projects to be ingested and the domain in which you want to ingest metadata from the Tableau projects or sub-projects. Project filtering is not relevant for those who have an Explorer role in Tableau, because Explorers need to configure permissions for each data object in Tableau that they want to ingest. As the Administrator role has access to all data objects, project filtering allows Administrators to specify which projects to ingest.	No
site_name > project_name : domain_id	The `site_name` should be the Tableau site name. The `project_name` should be the Tableau project name. The `domain_id` should be the unique reference ID of the domain in Collibra in which you want to ingest metadata. When you specify the site and project names, the following rules apply: Add spaces before and after >. The spaces are separators between the site and project. Specify the full exact site and project names. The values are case-sensitive. When you specify a Tableau project, all assets in the project are ingested in the specified domain. If you want to ingest assets from different Tableau projects in one domain, you can specify the same value for `domain_id` for different projects. Example configuration: `"Collibra_tab_partner_site > JB_Test_2812": "d224a1a5-43b4-43b2-8df0-ddf8f2726b82"`	No
site_name > project_name > sub-project_name : domain_id	The `site_name` should be the Tableau site name. The `project_name` should be the Tableau project name. Optionally, use `sub-project_name` to specify the Tableau sub-project name. The `domain_id` property should be the unique reference ID of the domain in Collibra in which you want to ingest metadata. When you specify the site, project and sub-project names, the following rules apply: Add spaces before and after >. The spaces are separators between the site and project. Specify the full exact site and project names. The values are case-sensitive. Example: `"Collibra_tab_partner_site > JB_Test_2812 > ProjectJJ2": "d224a1a5-43b4-43b2-8df0-ddf8f2726b82"`	No
domainMapping	This section defines in which domains in Collibra you want to ingest assets from your Tableau sites and Tableau projects. Domain mapping is transitive, meaning that all resources, such as Tableau workbooks and data attributes in a parent Tableau site, project or sub-project, are ingested in the same domain as the parent. Important The `domainMapping` and `filters` sections are mutually exclusive. Do not include both `domainMapping` and `filters` sections in your JSON file. Tip If you want to ingest all of the projects in a Tableau site into multiple domains in Collibra, use this `domainMapping` section. If you want to ingest all of the projects in a Tableau site into the default domain, use only the `domainID` property in the lineage harvester configuration file. The `domainID` property represents the default domain. Note Tableau assets that are not mapped to specific domains via this `domainMapping` section, for example Tableau Server assets, are ingested in that default domain. If you want to ingest all of the projects in a Tableau site into a single domain in Collibra, use site filtering. If you want to ingest metadata from only some of the projects in a Tableau site, use project filtering. Example Let's say that you have a Tableau site named "Site-1". You want to ingest all Tableau projects in "Site-1" in a domain named "Domain-1" in Collibra, with the exception of one Tableau project named "Project-Default", which you want to ingest in "Domain-2". You should configure the `domainMapping` section as follows: `"domainMapping": { "<Site-1>": "reference-id-of-Domain-1", "<Site-1> > <Project-Default>": "reference-id-of-Domain-2" }` If you want to specify a domain for a sub-project of "Project-Default", use the `<site name> > <project name> > <sub-project name>` property, as described below. For the properties in this `domainMapping` section, ensure that you maintain the spaces before and after "`>`", for example `"Site-1 > Project-Default"`. The spaces serve as a separator between the site and the projects.	No
site name	The Tableau site name, followed by the unique reference ID of the domain in Collibra in which you want to ingest resources from the Tableau site. In the configuration file, use the actual site name, along with the domain reference ID, for example: `"Collibra_tab_partner_site": "afc8cfb0-91f1-4075-a3e5-7ce6d1f9bcc9"`	No
site name > project name	The Tableau project name, preceded by the name of the Tableau site to which it belongs, and followed by the unique reference ID of the domain in Collibra in which you want to ingest resources from the Tableau project. In the configuration file, use the actual site and project names, along with the domain reference ID, for example: `"Collibra_tab_partner_site > JB_Test_2812": "d224a1a5-43b4-43b2-8df0-ddf8f2726b82"`	No
site name > project name > sub-project name	The Tableau sub-project name, preceded by the name of the Tableau site and project to which it belongs, and followed by the unique reference ID of the domain in Collibra in which you want to ingest resources from the Tableau sub-project. In the configuration file, use the actual site, project and sub-project names, along with the domain reference ID, for example: `"Collibra_tab_partner_site > JB_Test_2812 > ProjectJJ2": "d224a1a5-43b4-43b2-8df0-ddf8f2726b82"`	No
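Two of the rules in the table above lend themselves to a quick check before you paste or upload your JSON: the `filters` and `domainMapping` sections are mutually exclusive, and the `>` separator needs a space on each side. The following Python sketch is a hypothetical helper, not a Collibra tool; it assumes that site and project names do not themselves contain a `>` character.

```python
from typing import List


def check_source_configuration(config: dict) -> List[str]:
    """Return a list of human-readable problems found in the configuration."""
    problems = []

    # Rule: do not include both sections in the same JSON file.
    if "filters" in config and "domainMapping" in config:
        problems.append("'filters' and 'domainMapping' are mutually exclusive")

    # Collect the keys that can contain a '>' separator.
    project_keys = list(config.get("filters", {}).get("projects", {}))
    mapping_keys = list(config.get("domainMapping", {}))

    for key in project_keys + mapping_keys:
        # Remove correctly spaced separators; any '>' left over lacks a space.
        if ">" in key.replace(" > ", ""):
            problems.append(f"'{key}': add a space before and after '>'")
    return problems


if __name__ == "__main__":
    config = {
        "filters": {
            "sites": {"Training": "ca60b822-781b-4b3a-b44d-f65bd107ff92"},
            "projects": {"Testing > Databricks": "e8f4e4a8-4062-4a33-9b44-3ce3e18e4e22"},
        }
    }
    print(check_source_configuration(config))
```

An empty list means that neither rule is violated; it does not confirm that the site names, project names or domain reference IDs exist.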


No

Source Configuration File

An alternative to the Source Configuration field. Upload a .json file that contains your source configuration.

This file is required if your JSON content exceeds 256 KB, because large JSON strings provided in the Source Configuration field can cause the synchronization job to fail.

No

Property

This section contains the custom parameters you can specify to create technical lineage. Click Add property to add a property.

You can use this field to set the HTTP timeout duration by adding the httpTimeout property.

Warning If you are a Collibra Platform for Government customer, this field is required to connect to a Collibra Data Lineage service instance:

Properties for Collibra Platform for Government customers

Type	Value Type	Name	Value
Text	Plaintext	techlinHost	The URL of the Collibra Data Lineage service instance to which you want to upload metadata. Example: techlin-europe-west1.collibra.com
Text	Secret	techlinKey	The unique API key to connect to a Collibra Data Lineage service instance. Specify a unique user key for each Collibra environment. If you're not sure what your user key is, contact your Collibra Account Team. Example: <your-techlin-key>

Yes for US government customers.

Dependent On Sources

This option allows you to provide table-definition details from an independent data source to a data source that is dependent on those details. This is needed to avoid analysis errors and to have a complete lineage that includes lineage from the SQL statements from dependent data sources.

To use this option, enter the source ID of the independent source.

Important If a dependent data source contains lowercase column names, this feature will only work for the following dialects: Oracle, Snowflake, and Teradata. For all other dialects:

An analyze error is raised, prompting you to provide the DDL file.
The only workaround is to consolidate your SQL statements and DDL file in a single data source.

For complete information, go to Sharing database models across data sources.

No

Delete Raw Metadata After Processing

Technical lineage via Edge harvests raw metadata from specified data sources and uploads it in a ZIP file to a Collibra Data Lineage service instance. This option indicates whether the raw metadata should be deleted from the Collibra Data Lineage service instance after the metadata that is targeted for ingestion in Data Catalog is processed.

Select this option to indicate that the raw source metadata is deleted after processing.

Clear the checkbox to keep the raw source metadata after processing. In this case, it is stored in the Collibra infrastructure.

Analyze Only (Deprecated)

Important This option is deprecated and will be removed in a future version of Collibra. We recommend that you no longer use it. The mandatory Processing Level setting, below, replaces this option.

The "Analyze" option in the Processing Level setting is the equivalent of selecting the Analyze Only option.
The "Sync" option in the Processing Level setting is the equivalent of clearing the Analyze Only option.

No

Processing Level

Important This setting replaces the deprecated Analyze Only option, which will be removed in a future version of Collibra.

For each of your data sources, you have to specify one of the following values: Load, Analyze, or Sync. Then, when you synchronize your technical lineage, the following process begins:

Metadata for all data sources is loaded, regardless of the value of this setting for a particular data source.
Metadata from data sources for which the value of this setting is either Analyze or Sync is analyzed.
Metadata from data sources for which the value of this setting is Sync is synchronized.
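The three statements above can be read as a simple mapping from Processing Level to the stages that a data source passes through. The following Python sketch only restates that mapping; the names `STAGES_BY_LEVEL` and `plan_stages` and the source IDs in the usage example are hypothetical.

```python
STAGES = ("load", "analyze", "sync")

# Each Processing Level includes every stage up to and including its own.
STAGES_BY_LEVEL = {
    "Load": STAGES[:1],
    "Analyze": STAGES[:2],
    "Sync": STAGES[:3],
}


def plan_stages(sources: dict) -> dict:
    """Map each stage to the source IDs that take part in it.

    `sources` maps a source ID to its Processing Level value.
    """
    plan = {stage: [] for stage in STAGES}
    for source_id, level in sources.items():
        for stage in STAGES_BY_LEVEL[level]:
            plan[stage].append(source_id)
    return plan


if __name__ == "__main__":
    print(plan_stages({"tableau-a": "Load", "tableau-b": "Analyze", "tableau-c": "Sync"}))
```

In this example, all three sources are loaded, two are analyzed, and only the source set to Sync is synchronized.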

Value Description

Load

Harvest metadata from the data source and upload it to your Collibra environment. This allows you to inspect and, if necessary, edit the harvested metadata before uploading it to the Collibra Data Lineage service instance for analysis.

When the job is done, you can download and review the metadata:

Open the Activities list.
In the row containing the job, click Result.
The Synchronization Results dialog box appears.
Click download and save the ZIP file to your hard drive.

Tip The download link resembles the following: https://integrations.collibra-abc.com/rest/2.0/files/01944f12-7665-7d9c-8bc5-aa426b6a63cc. Take note of the file ID, in this example: 01944f12-7665-7d9c-8bc5-aa426b6a63cc. After you inspect the metadata, you can send the ZIP file for analysis by using the "Analyze files" option. Alternatively, you can upload the ZIP file using the POST /files API. In either case, you need to specify the file ID.

Analyze

Load and analyze the metadata on the Collibra Data Lineage service instance.

Synchronization does not start after analysis; it starts only after either:

You trigger synchronization of another data source for which you specify Sync in the Processing Level drop-down list.
You configure the Technical Lineage Admin Edge or Collibra Cloud site capability, and trigger synchronization via the Sync option in the Integration Configuration tab in Data Catalog.

Important If you want to synchronize multiple data sources, we strongly recommend that you select this option in the respective Edge or Collibra Cloud site capabilities for each of your data sources. This allows you to synchronize all data sources in a single job, thereby maximizing efficiency and mitigating the risk of failed synchronization jobs.

For complete information and important considerations, go to Tips for successful lineage synchronization.
For more information about the Sync option in the Technical Lineage Admin Edge or Collibra Cloud site capability, go to Technical lineage admin options.

Sync

Load, analyze, and synchronize metadata from all data sources. Synchronization starts – or is queued, if another synchronization job is running – immediately after analysis.

Important If you want to synchronize multiple data sources and you select this option, each data source is processed as a separate job. This is highly inefficient and will likely lead to failed sync jobs. For complete information and important considerations, go to Tips for successful lineage synchronization.
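If you use the Load option and need the file ID from the download link, you can read it from the last segment of the link. The following Python sketch shows one way to do that; it assumes that the link always ends with the file ID, as in the example link in the Load description, and the function name is hypothetical.

```python
from urllib.parse import urlparse


def file_id_from_download_link(link: str) -> str:
    """Return the last path segment of the download link, taken as the file ID."""
    path = urlparse(link).path
    # Ignore a trailing slash, then keep what follows the final '/'.
    return path.rstrip("/").rsplit("/", 1)[-1]


if __name__ == "__main__":
    link = "https://integrations.collibra-abc.com/rest/2.0/files/01944f12-7665-7d9c-8bc5-aa426b6a63cc"
    print(file_id_from_download_link(link))
```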

Yes

Active

This option determines whether to include or exclude the technical lineage of the data source.

Select this option to include the technical lineage of this data source.

Clear the checkbox to exclude the technical lineage of this data source.

No

Paging

This option allows you to customize the Tableau API pagination settings.

The default values are sufficient in most cases; however, you can decrease them to help mitigate node limit errors, or increase them to speed up API calls.

If the integration fails because of timeout errors due to page sizing limits, Collibra Data Lineage automatically adjusts the limits and retries. For example, if failure occurs with worksheetsPageSize set to 100, the value is automatically reduced to 50 and another integration attempt is automatically started. If it fails again, the value is again halved. If integration is still unsuccessful with an adjusted value of 1, an error is thrown and no further attempts are started. If integration is eventually successful, the page size value is restored to its original value, in this example 100, for the next synchronization.
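The retry behavior described above can be sketched as a loop that halves the page size after each failed attempt. This is an illustration of the described behavior, not the actual implementation; the function name and the `attempt` callback are hypothetical, and rounding down when halving an odd value (for example, 25 to 12) is an assumption.

```python
from typing import Callable, List


def harvest_with_backoff(attempt: Callable[[int], bool], page_size: int) -> List[int]:
    """Call `attempt` with a shrinking page size and return the sizes tried.

    `attempt` returns True on success and False on a timeout failure.
    The caller's configured `page_size` is not modified, so the next
    synchronization starts again from the original value.
    """
    tried = []
    size = page_size
    while True:
        tried.append(size)
        if attempt(size):
            return tried
        if size == 1:
            # No smaller page size is possible, so give up.
            raise RuntimeError("integration failed even with a page size of 1")
        size = max(1, size // 2)  # assumption: halve and round down


if __name__ == "__main__":
    # Pretend that any page size above 30 times out.
    print(harvest_with_backoff(lambda size: size <= 30, 100))
```

In the usage example, an original value of 100 fails, 50 fails, and 25 succeeds.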

The complete list of pagination settings, descriptions and default values

"paging": {
	"databasesPageSize": 100,
	"tablesPageSize": 100,
	"tablesColumnsPageSize": 100,
	"tableColumnsPageSize": 1000,
	"datasourcesPageSize": 50,
	"datasourcesFieldsPageSize": 50,
	"datasourceFieldsPageSize": 100,
	"worksheetsPageSize": 100,
	"worksheetsFieldsPageSize": 100,
	"worksheetFieldsPageSize": 1000,
	"usersPageSize": 100,
	"dashboardsPageSize": 100,
	"columnsLimit": 20,
	"fieldsLimit": 20
}

Settings per metadata type and descriptions

Metadata type	Setting and description
Dashboard	`dashboardsPageSize`: The number of dashboards per page.
Worksheet	`worksheetsPageSize`: The number of worksheets per page. `worksheetsFieldsPageSize`: The number of worksheet fields per page.
Database	`databasesPageSize`: The number of databases per page.
Table	`tablesPageSize`: The number of tables per page. `tablesColumnsPageSize`: The number of table columns per page.
Table columns	`tableColumnsPageSize`: The number of table columns per page.
Users	`usersPageSize`: The number of users per page.
Data source	`datasourcesPageSize`: The number of data sources per page. `datasourcesFieldsPageSize`: The number of data source fields per page. `columnsLimit`: The number of data source field columns per page. `fieldsLimit`: The number of referenced data source fields per page.
Data source field	`datasourceFieldsPageSize`: The number of data source fields per page. `columnsLimit`: The number of data source field columns per page. `fieldsLimit`: The number of referenced data source fields per page.

No

Debug

This setting is not valid for this integration. It should be set to false.

No

Log level

Complete this field only at the request of, or together with, Collibra Support.

No
