Code Migration using Product APIs

This document describes the process of migrating Collibra Data Quality code across environments using Product APIs. It supports both pushdown and pullup datasets.

Note You must be logged in to Collibra Data Quality & Observability to access the Swagger API. Open Swagger either from the Admin Console or by clicking the icon in the upper-right corner of the page.

The migration process consists of the following steps:

  1. Get definitions from the source environment
  2. Update payloads for the target environment
  3. Post definitions to target environment
  4. Run job using the definition

The following sections describe these steps in more detail.

Get definitions from the source environment

Use the following endpoints to get the data that you want to migrate.

Get dataset definitions

GET /v3/datasetDefs/{dataset}

This endpoint returns the dataset definition for the specified dataset.
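As a sketch, the call can be made with Python's standard library. The base URL and bearer-token authentication shown here are assumptions for illustration; substitute the host name and authentication scheme of your own source environment.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical values for illustration; replace with your source
# environment's base URL and credentials.
SOURCE_URL = "https://dq-source.example.com"
API_TOKEN = "<your-bearer-token>"

def dataset_def_url(base_url: str, dataset: str) -> str:
    """Build the URL for GET /v3/datasetDefs/{dataset}."""
    return f"{base_url}/v3/datasetDefs/{urllib.parse.quote(dataset, safe='')}"

def get_dataset_def(dataset: str) -> dict:
    """Fetch the dataset definition JSON from the source environment."""
    req = urllib.request.Request(
        dataset_def_url(SOURCE_URL, dataset),
        headers={
            "Authorization": f"Bearer {API_TOKEN}",
            "Accept": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Save the returned JSON; it becomes the payload for the POST step after you adjust it for the target environment.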

Get rules

GET /v3/rules

This endpoint returns the definitions of all custom rules associated with the specified dataset.
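A minimal sketch of the URL for this call, assuming the dataset is passed as a query parameter (confirm the exact parameter name on your Swagger page):

```python
import urllib.parse

def rules_url(base_url: str, dataset: str) -> str:
    """Build the URL for GET /v3/rules, filtered by dataset.

    Passing the dataset as a query parameter is an assumption here;
    confirm the exact parameter name on your Swagger page.
    """
    return f"{base_url}/v3/rules?{urllib.parse.urlencode({'dataset': dataset})}"
```

The request itself is sent the same way as the dataset-definition call, with the same authentication headers.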

Update payloads for the target environment

After retrieving the dataset and rules definitions, copy them to a text editor. You can then prepare them for use as payloads in the subsequent POST endpoints.

Note The following table provides the minimum number of properties to be updated. For a comprehensive list, create a sample dataset in the target environment using the UI and compare the JSON definitions.

Dataset payload

In the dataset payload, update the following fields for the target environment:

host: Change to the host name of the target environment. Use the endpoint v2/getremotehost if necessary.
user: The Collibra DQ user that will be used to run the newly migrated datasets.
agentId: Change the id and uuid values under agentId to match the target environment. Use the endpoint v2/getagents if necessary.
lib: Verify that the library specified matches the target environment.
master: Verify that the master URL (specified under spark) matches the target environment.
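The edits above can be scripted rather than made by hand in a text editor. This sketch assumes the fields sit at the JSON paths named in the table (host and user at the top level, id and uuid under agentId); compare against a sample definition exported from your target environment to confirm the paths before relying on it.

```python
import copy

def prepare_dataset_payload(source_def: dict, target: dict) -> dict:
    """Return a copy of the source dataset definition adjusted for the
    target environment.

    The JSON paths used below (host, user, agentId.id, agentId.uuid)
    are assumptions; verify them against a sample definition from your
    target environment.
    """
    payload = copy.deepcopy(source_def)
    payload["host"] = target["host"]      # target environment host name
    payload["user"] = target["user"]      # DQ user that will run the dataset
    # id and uuid under agentId must match the target (see v2/getagents)
    payload["agentId"]["id"] = target["agent_id"]
    payload["agentId"]["uuid"] = target["agent_uuid"]
    # Also verify manually that lib and the spark master URL match the target.
    return payload
```

Because the function deep-copies the source definition, the original stays untouched and can be reused if the migration needs to be repeated.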

Rules payload

In the rules payload, verify that the userNM aligns with the target environment. Update the payload for each rule that you want to migrate.
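Since the rules payload is an array, the same userNM update can be applied to every rule in one pass. A minimal sketch, assuming userNM is a top-level field on each rule object:

```python
def prepare_rules_payload(rules: list, user_nm: str) -> list:
    """Return copies of the rule definitions with userNM set for the
    target environment. Assumes userNM is a top-level field on each rule."""
    return [{**rule, "userNM": user_nm} for rule in rules]
```

All other rule fields are carried over unchanged.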

Post definitions to target environment

Use the payloads with the following endpoints to migrate code to the target environment.

Post dataset definitions

POST /v3/datasetDefs

Post rules

POST /v3/rules/{dataset}

This endpoint accepts multiple rule definitions (i.e., an array of rules). If a rule already exists, the endpoint overwrites it with the new values.
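Both POST calls follow the same shape: a JSON body plus an authenticated request against the target environment. The target URL and bearer-token authentication below are illustrative assumptions; substitute your own environment's values.

```python
import json
import urllib.request

# Hypothetical target environment and token; substitute your own values.
TARGET_URL = "https://dq-target.example.com"
API_TOKEN = "<your-bearer-token>"

def build_post(path: str, payload) -> urllib.request.Request:
    """Build a POST request carrying a JSON payload for the target environment."""
    return urllib.request.Request(
        f"{TARGET_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To send (requires a reachable target environment):
#   urllib.request.urlopen(build_post("/v3/datasetDefs", dataset_payload))
#   urllib.request.urlopen(build_post("/v3/rules/my_dataset", rules_payload))
```

Note that the rules payload sent to /v3/rules/{dataset} is an array, while the dataset definition sent to /v3/datasetDefs is a single object.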

Run job

After migrating the datasets and rules, use the following endpoint to run the job:

POST /v3/jobs/run

Provide the following parameters:

dataset (string): Specify the name of the dataset that you created.
runDate (string($date-time)): Specify the run date.
runDateEnd (string($date-time)): Required only for datasets that have a date range.
agentName (string): Specify the agentName. Use the endpoint v2/getagents if necessary to obtain the agentName. Required only for Pullup datasets.
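A sketch of assembling the run call, assuming the parameters above are sent as query parameters (confirm the exact request shape on your Swagger page):

```python
import urllib.parse

def run_job_url(base_url, dataset, run_date, run_date_end=None, agent_name=None):
    """Build the URL for POST /v3/jobs/run.

    Assumes the parameters are sent as query parameters; confirm the
    exact request shape on your Swagger page.
    """
    params = {"dataset": dataset, "runDate": run_date}
    if run_date_end is not None:
        params["runDateEnd"] = run_date_end   # datasets with a date range
    if agent_name is not None:
        params["agentName"] = agent_name      # Pullup datasets only
    return f"{base_url}/v3/jobs/run?{urllib.parse.urlencode(params)}"
```

The resulting URL is then posted with the same authentication headers as the earlier calls.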

You can check the status of the job from the Jobs page.