Prepare the Looker <source ID> configuration file

The lineage harvester uses the lineage harvester configuration file to collect the Looker data objects and send them to the Collibra Data Lineage service instance.

The <source ID> configuration file allows you to:

Filter on the Looker folders from which you want to ingest metadata.
Specify the system name of databases in Looker with the collibraSystemName property, if useCollibraSystemName in the lineage harvester configuration file is set to true. Collibra Data Lineage uses the system names to match the structure of databases in Looker to assets in Data Catalog.

Example
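
As a rough illustration, the following Python sketch builds a <source ID> configuration and prints it as JSON. The property names are the ones described in the steps below. Everything else is an assumption made for illustration: the nesting (a connection name used as a key under Connections, and filters as a list of objects), the connection name, the dialect value, and the schema, database, system name, domain ID, and folder values are all hypothetical. Check the shape against the reference for your Collibra version before using it.

```python
import json

# Illustrative values only. The connection name, dialect, schema, database,
# system name, domain ID, and folder names/IDs are hypothetical.
# ASSUMPTION: the connection name is a key under "Connections" and "filters"
# is a list of objects. The nesting is not confirmed by this page.
source_config = {
    "Connections": {
        "my_looker_connection": {
            "dialect": "bigquery",
            "schema": "sales",
            "dbname": "analytics",
            "collibraSystemName": "warehouse-prod",
        }
    },
    # Remove the whole "filters" section if you do not filter on folders.
    "filters": [
        {
            "domainId": "00000000-0000-0000-0000-000000000001",
            "description": "Folders owned by the sales team",
            "folderNames": ["Sales dashboards"],
            "folderIds": ["42"],
        }
    ],
}

# Serialize the configuration as JSON text, ready to save with a .conf extension.
config_text = json.dumps(source_config, indent=2)
print(config_text)
```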

Steps

Create a new JSON file in the lineage harvester config folder.
Give the JSON file the same name as the value of the Id property in the lineage harvester configuration file.
Example The value of the Id property in the lineage harvester configuration file is looker-source-1. As a result, the name of your JSON file should be looker-source-1.conf.
Important Your JSON file must have the file extension .conf.

For each database in Looker, add the following content to the JSON file:

Property: Connections
Description: This section contains all Looker connections for which you want to create a technical lineage.
Mandatory?: Yes

Property: <connection name>
Description: The name of a connection object in Looker.
Mandatory?: Yes

Property: dialect
Description: The dialect of the supported data source in Looker.

Property: schema
Description: The name of the default schema of a supported data source in Looker. If the lineage harvester fails to find a specific schema, it uses the default schema.

Property: dbname
Description: The name of the database of a supported data source in Looker.

Property: collibraSystemName
Description: The system or server name of a database. If you set the useCollibraSystemName property to true in your lineage harvester configuration file, but you either don't create a <source ID> configuration file, or don't specify a value for the collibraSystemName property in your <source ID> configuration file, the system name in the technical lineage is "DEFAULT".
Mandatory?: Yes

Property: filters
Description: Optionally, use this section to specify the Looker folders from which you want to ingest metadata.

Note You can filter on Looker folders, but not on Looker data sets. That's because Looker data sets are linked directly to the server, instead of a folder, as shown in the Looker metadata overview. Looker data sets are ingested in the default domain, regardless of any filtering.

Let’s say, for example, you filter on folder B. A Looker Folder asset is created in the specified domain in Collibra, and all of the metadata in folder B is ingested. If folder B has a parent folder A, then a Looker Folder asset is created for folder A (in the domain specified for folder B) to preserve the hierarchy, but no metadata from folder A is ingested.

You can specify more than one Looker folder for ingestion into a single domain in Collibra.

Warning If you don't want to filter on Looker folders, you must completely remove this filters section.

Tip You can use wildcards to capture multiple connection string combinations.

Property: domainId
Description: The unique resource ID of the domain (or domains) in Collibra into which you want to ingest data objects from one or more Looker folders.

Tip You can find the domain ID by clicking the domain type. Then look in the URL of your browser to find the ID. The URL looks like https://<yourcollibrainstance>/domain/<domain ID>?<view>.

Property: description
Description: Any description you see fit.

Property: folderNames
Description: The name (or names) of the Looker folders from which you want to ingest metadata.

Note You must specify either a folder name, a folder ID, or both.

Property: folderIds
Description: The ID (or IDs) of the Looker folders from which you want to ingest metadata.

Note You must specify either a folder ID, a folder name, or both.

Save the <source ID> configuration file.
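
The rules in the steps above can be sketched as a few small Python checks to run before handing the file to the lineage harvester. This is a minimal sketch that models the behavior described on this page, not the harvester's own logic. The function names are hypothetical, and the inputs are assumed to be plain dictionaries keyed by the property names above.

```python
from typing import List, Optional

# System name that the technical lineage shows when none is specified.
DEFAULT_SYSTEM_NAME = "DEFAULT"


def config_file_name(source_id: str) -> str:
    """Return the file name for a source: the Id value plus the .conf extension."""
    return f"{source_id}.conf"


def effective_system_name(
    use_collibra_system_name: bool, connection: Optional[dict]
) -> Optional[str]:
    """Model which system name ends up in the technical lineage.

    Returns None when useCollibraSystemName is false, because system names
    are then not taken from the <source ID> configuration file.
    """
    if not use_collibra_system_name:
        return None
    # A missing file (connection is None) or a missing or empty
    # collibraSystemName both fall back to "DEFAULT".
    name = (connection or {}).get("collibraSystemName")
    return name if name else DEFAULT_SYSTEM_NAME


def filter_problems(filter_entry: dict) -> List[str]:
    """Check the one rule stated for a filter: a folder name, a folder ID, or both."""
    problems = []
    if not filter_entry.get("folderNames") and not filter_entry.get("folderIds"):
        problems.append("specify folderNames, folderIds, or both")
    return problems


print(config_file_name("looker-source-1"))  # prints looker-source-1.conf
```

For example, effective_system_name(True, None) returns "DEFAULT", which matches the case where useCollibraSystemName is true but no <source ID> configuration file exists.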

What's next?

See Overview Looker integration steps.