Prepare the Looker <source ID> configuration file

The lineage harvester uses the lineage harvester configuration file to collect the Looker data objects and send them to the Collibra Data Lineage service instance.

The <source ID> configuration file allows you to:

  • Filter on the Looker folders from which you want to ingest metadata.
  • If useCollibraSystemName in the lineage harvester configuration file is set to true, use the collibraSystemName property to specify the system name of databases in Looker.
    Collibra Data Lineage uses the system names to match the structure of databases in Looker to assets in Data Catalog.
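As an illustration, the following Python sketch holds a possible <source ID> configuration file as a string and checks that it parses as JSON. The nesting (connection names as keys under Connections, filters as a list of objects) is an assumption inferred from the property descriptions in the steps below. Every value (connection name, dialect, schema, database, system name, domain ID, folder name and folder ID) is a hypothetical placeholder, so treat this as a starting point rather than a verified template.

```python
import json

# Hypothetical contents of looker-source-1.conf. The layout is an assumption
# inferred from the property descriptions; all values are placeholders.
EXAMPLE_CONF = """
{
  "Connections": {
    "my-looker-connection": {
      "dialect": "bigquery",
      "schema": "sales",
      "dbname": "analytics",
      "collibraSystemName": "looker-system-1"
    }
  },
  "filters": [
    {
      "domainId": "00000000-0000-0000-0000-000000000001",
      "description": "Sales dashboards",
      "folderNames": ["Sales"],
      "folderIds": ["42"]
    }
  ]
}
"""

# json.loads raises json.JSONDecodeError if the text is not valid JSON.
config = json.loads(EXAMPLE_CONF)

# List each connection together with the system name it maps to.
for name, connection in config["Connections"].items():
    print(name, "->", connection["collibraSystemName"])
```

Parsing the file before you run the lineage harvester is a cheap way to catch a missing comma or bracket early.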

Steps

  1. Create a new JSON file in the lineage harvester config folder.
  2. Give the JSON file the same name as the value of the Id property in the lineage harvester configuration file.
    Example: The value of the Id property in the lineage harvester configuration file is looker-source-1. In that case, name your JSON file looker-source-1.conf.
    Important: Your JSON file must have the file extension .conf.
  3. For each database in Looker, add the following content to the JSON file:

    Property: Connections
    Mandatory? Yes
    Description: This section contains all Looker connections for which you want to create a technical lineage.

    Property: <connection name>
    Mandatory? Yes
    Description: The name of a connection object in Looker.

    Property: dialect
    Mandatory? No
    Description: The dialect of the supported data source in Looker.

    Property: schema
    Mandatory? No
    Description: The name of the default schema of a supported data source in Looker. If the lineage harvester fails to find a specific schema, it uses the default schema.

    Property: dbname
    Mandatory? No
    Description: The name of the database of a supported data source in Looker.

    Property: collibraSystemName
    Mandatory? Yes
    Description: The system or server name of a database.

      If you set the useCollibraSystemName property to true in your lineage harvester configuration file, but you either don't create a <source ID> configuration file, or don't specify a value for the collibraSystemName property in your <source ID> configuration file, the system name in the technical lineage is "DEFAULT".

    Property: filters
    Mandatory? No
    Description: Optionally, use this section to specify the Looker folders from which you want to ingest metadata.

      Note: You can filter on Looker folders, but not on Looker data sets. That's because Looker data sets are linked directly to the server, instead of a folder, as shown in the Looker metadata overview. Looker data sets are ingested in the default domain, regardless of any filtering.

      Let’s say, for example, you filter on folder B. A Looker Folder asset is created in the specified domain in Collibra, and all of the metadata in folder B is ingested. If folder B has a parent folder A, then a Looker Folder asset is created (in the domain specified for folder B) to preserve the hierarchy, but no metadata from folder A is ingested.

      You can specify more than one Looker folder for ingestion into a single domain in Collibra.

      Warning: If you don't want to filter on Looker Folders, you must completely remove this filters section.

      Tip: You can use wildcards to capture multiple connection string combinations.

    Property: domainId
    Description: The unique resource ID of the domain (or domains), in Collibra, in which you want to ingest data objects from one or more Looker Folders.

      Tip: You can find the domain ID by clicking the domain type. Then look in the URL of your browser to find the ID. The URL looks like https://<yourcollibrainstance>/domain/<domain ID>?<view>.

    Property: description
    Description: Any description, as you see fit.

    Property: folderNames
    Description: The name (or names) of the Looker Folders from which you want to ingest.

      Note: You must specify either a folder name, a folder ID, or both.

    Property: folderIds
    Description: The ID (or IDs) of the Looker Folder you want to ingest.

      Note: You must specify either a folder ID, a folder name, or both.
  4. Save the <source ID> configuration file.
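
The steps above can be sketched in Python. The helper names check_source_config and write_source_config, the sample values, and the assumed JSON nesting are illustrative and not part of the lineage harvester. The checks only mirror rules stated on this page: the Connections section is required, each filter needs a folder name, a folder ID, or both, and the file is named after the Id value with the .conf extension.

```python
import json
import tempfile
from pathlib import Path


def check_source_config(config):
    """Return a list of problems found in a <source ID> configuration dict."""
    problems = []
    if not config.get("Connections"):
        problems.append("Connections section is missing or empty")
    # The filters section is optional. When present, each entry needs
    # at least one of folderNames or folderIds.
    for index, entry in enumerate(config.get("filters", [])):
        if not entry.get("folderNames") and not entry.get("folderIds"):
            problems.append(f"filters[{index}] needs folderNames or folderIds")
    return problems


def write_source_config(config_dir, source_id, config):
    """Write the config as <source_id>.conf in config_dir and return the path."""
    problems = check_source_config(config)
    if problems:
        raise ValueError("; ".join(problems))
    path = Path(config_dir) / f"{source_id}.conf"
    path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return path


# Hypothetical values throughout; the nesting is an assumption.
sample = {
    "Connections": {
        "my-looker-connection": {"collibraSystemName": "looker-system-1"}
    },
    "filters": [{"domainId": "<domain ID>", "folderIds": ["42"]}],
}

# A temporary directory stands in for the lineage harvester config folder.
with tempfile.TemporaryDirectory() as config_dir:
    written = write_source_config(config_dir, "looker-source-1", sample)
    print(written.name)  # prints looker-source-1.conf
```

Validating before writing keeps a half-finished filters section from reaching the harvester; if you don't want to filter at all, leave the filters key out of the dict entirely.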

What's next?

See Overview Looker integration steps.