Tableau hostname, schema, and system name mapping

To achieve end-to-end lineage and stitching, Collibra Data Lineage must match the full names of data objects in a technical lineage to the full names of their corresponding assets in Data Catalog. Several situations can impede full-name matching. In such cases, you can include a hostnameMapping section in your Tableau <source ID> configuration file to map the database, schema, or system names returned by the Tableau APIs to the actual names of the assets in Data Catalog.

Tip "Mapping" means changing the full name of data objects as they appear in a technical lineage, so that they match the full names of their corresponding assets in Data Catalog.

The following example scenarios can impede full-name matching:

  • Tableau can't derive the schema name. In this case, the schema name in the technical lineage is DEFAULT.
  • You have schema-less external data sources, such as HiveQL, MySQL or Teradata. In this case, the database name in the technical lineage is also the schema name.
  • You have a data access layer between Tableau and your external data source. In this case, Tableau might incorrectly interpret the data access layer as the database name, and the data source as the schema.
  • You have data sources that are created based on tables from other data sources in Tableau. These data sources do not have schemas.
  • The Tableau APIs returned a technical database or server name that is different than the real name of the database or server.
Warning hostnameMapping replaces the following deprecated properties:
  • The databaseMapping property.
  • The databases sub-section of the collibraSystemNames section.

hostnameMapping must not be used in combination with either of these properties.

For descriptions of these properties, go to the Tableau section in the Prepare a <source ID> configuration file topic.

If you use the hostnameMapping section, you can still use the collibraSystemName property in conjunction with the files, connectors or cloudfiles sub-sections.

Example configurations

  • The following configuration:
    • Changes the found database name "Test" to "CData".
    • Changes the found schema name "DEFAULT" to "Jan_1_2022".
    • Adds the Collibra system name "TV_testing".
      Important The system name must exactly match the name you specified for the id property in the lineage harvester configuration file. The match is case-sensitive.
    "hostnameMapping": {
      "found_dbname=Test;found_hostname=*;found_schema=DEFAULT": {
        "dbname": "CData",
        "schema": "Jan_1_2022",
        "dialect": "spark",
        "collibraSystemName": "TV_testing"
      }
    }
  • The following configuration:
    • For all found databases on the host "abc.net", changes their names to "CData".
    • Changes the found schema name "DEFAULT" to "Jan_1_2022".
    "hostnameMapping": {
      "found_dbname=*;found_hostname=abc.net;found_schema=DEFAULT": {
        "dbname": "CData",
        "schema": "Jan_1_2022",
        "dialect": "spark"
      }
    }
  • The following configuration:
    • Changes the found database name "Test" to "CData".
    • Changes the found schema name "DEFAULT" to "Jan_1_2022".
    "hostnameMapping": {
      "found_dbname=Test;found_hostname=*;found_schema=DEFAULT": {
        "dbname": "CData",
        "schema": "Jan_1_2022",
        "dialect": "spark"
      }
    }
  • The following configuration:
    • Changes the found database name "Test" to "CData".
    "hostnameMapping": {
      "found_dbname=Test;found_hostname=*;found_schema=DEFAULT": {
        "dbname": "CData"
      }
    }
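
To check a hostnameMapping section before you run the lineage harvester, a small script can help. The following Python sketch is not part of Collibra Data Lineage and does not reproduce the harvester's actual matching logic. It assumes that a mapping key always contains the three found_* parts shown in the examples, that "*" matches any value, and that all other values must match exactly. The function names parse_mapping_key and apply_mapping, the database name "Sales", and the host "xyz.net" are hypothetical.

```python
import json

# Key parts that appear in the example configurations; treat this list as an assumption.
KEY_PARTS = ("found_dbname", "found_hostname", "found_schema")


def parse_mapping_key(key):
    """Split 'found_dbname=Test;found_hostname=*;found_schema=DEFAULT' into a dict."""
    parts = {}
    for chunk in key.split(";"):
        name, sep, value = chunk.partition("=")
        if not sep or not value:
            raise ValueError(f"malformed key part: {chunk!r}")
        parts[name.strip()] = value.strip()
    missing = [p for p in KEY_PARTS if p not in parts]
    if missing:
        raise ValueError(f"key {key!r} is missing: {', '.join(missing)}")
    return parts


def apply_mapping(mapping, dbname, hostname, schema):
    """Return the (dbname, schema) pair after the first matching rule.

    Assumption: '*' matches any value and other values must match exactly.
    """
    found = {"found_dbname": dbname, "found_hostname": hostname, "found_schema": schema}
    for key, target in mapping.items():
        pattern = parse_mapping_key(key)
        if all(pattern[p] in ("*", found[p]) for p in KEY_PARTS):
            return target.get("dbname", dbname), target.get("schema", schema)
    return dbname, schema


# The fragment is wrapped in braces so that it is a complete JSON document.
# json.loads raises json.JSONDecodeError on invalid JSON such as trailing commas.
config = json.loads("""
{
  "hostnameMapping": {
    "found_dbname=*;found_hostname=abc.net;found_schema=DEFAULT": {
      "dbname": "CData",
      "schema": "Jan_1_2022",
      "dialect": "spark"
    }
  }
}
""")
mapping = config["hostnameMapping"]

print(apply_mapping(mapping, "Sales", "abc.net", "DEFAULT"))  # ('CData', 'Jan_1_2022')
print(apply_mapping(mapping, "Sales", "xyz.net", "DEFAULT"))  # ('Sales', 'DEFAULT')
```

Because json.loads rejects trailing commas, loading the section this way also catches a common syntax error in hand-edited configuration files.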