Code Migration using Product APIs

This document describes the process of migrating Collibra Data Quality code across environments using Product APIs. It supports both pushdown and pullup datasets.

Note You must be logged in to Collibra Data Quality & Observability to access the Swagger API. Open Swagger either from the Admin Console or by clicking the icon in the upper-right corner of the page.

The migration process consists of the following steps:

  1. Get definitions from the source environment
  2. Update payloads for the target environment
  3. Post definitions to target environment
  4. Run job using the definition

The following sections describe these steps in more detail.

Get definitions from the source environment

Use the following endpoints to get the data that you want to migrate.

Get dataset definitions

GET /v3/datasetDefs/{dataset}

This endpoint returns the dataset definition for the specified dataset.
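As a sketch, the call can be made with Python's standard library. The base URL and bearer-token authentication shown here are assumptions for illustration; substitute the host name and authentication scheme of your own source environment.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical values for illustration; replace with your source
# environment's base URL and credentials.
SOURCE_URL = "https://dq-source.example.com"
API_TOKEN = "<your-bearer-token>"

def dataset_def_url(base_url: str, dataset: str) -> str:
    """Build the URL for GET /v3/datasetDefs/{dataset}."""
    return f"{base_url}/v3/datasetDefs/{urllib.parse.quote(dataset, safe='')}"

def get_dataset_def(dataset: str) -> dict:
    """Fetch the dataset definition JSON from the source environment."""
    req = urllib.request.Request(
        dataset_def_url(SOURCE_URL, dataset),
        headers={
            "Authorization": f"Bearer {API_TOKEN}",
            "Accept": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Save the returned JSON; it becomes the payload for the POST step after you adjust it for the target environment.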

Get rules

GET /v3/rules

This endpoint returns the definitions of all custom rules associated with the specified dataset.
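A minimal sketch of the URL for this call, assuming the dataset is passed as a query parameter (confirm the exact parameter name on your Swagger page):

```python
import urllib.parse

def rules_url(base_url: str, dataset: str) -> str:
    """Build the URL for GET /v3/rules, filtered by dataset.

    Passing the dataset as a query parameter is an assumption here;
    confirm the exact parameter name on your Swagger page.
    """
    return f"{base_url}/v3/rules?{urllib.parse.urlencode({'dataset': dataset})}"
```

The request itself is sent the same way as the dataset-definition call, with the same authentication headers.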

Update payloads for the target environment

After retrieving the dataset and rules definitions, copy them to a text editor. You can then prepare them for use as payloads in the subsequent POST endpoints.

Note The following table provides the minimum number of properties to be updated. For a comprehensive list, create a sample dataset in the target environment using the UI and compare the JSON definitions.

Dataset payload

In the dataset payload, update the following fields for the target environment:

host: Change to the host name of the target environment. Use the endpoint v2/getremotehost if necessary.
user: The Collibra DQ user that will be used to run the newly migrated datasets.
agentId: Change the id and uuid values under agentId to match the target environment. Use the endpoint v2/getagents if necessary.
lib: Verify that the library specified matches the target environment.
master: Verify that the master URL (specified under spark) matches the target environment.
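The edits above can be scripted rather than made by hand in a text editor. This sketch assumes the fields sit at the JSON paths named in the table (host and user at the top level, id and uuid under agentId); compare against a sample definition exported from your target environment to confirm the paths before relying on it.

```python
import copy

def prepare_dataset_payload(source_def: dict, target: dict) -> dict:
    """Return a copy of the source dataset definition adjusted for the
    target environment.

    The JSON paths used below (host, user, agentId.id, agentId.uuid)
    are assumptions; verify them against a sample definition from your
    target environment.
    """
    payload = copy.deepcopy(source_def)
    payload["host"] = target["host"]      # target environment host name
    payload["user"] = target["user"]      # DQ user that will run the dataset
    # id and uuid under agentId must match the target (see v2/getagents)
    payload["agentId"]["id"] = target["agent_id"]
    payload["agentId"]["uuid"] = target["agent_uuid"]
    # Also verify manually that lib and the spark master URL match the target.
    return payload
```

Because the function deep-copies the source definition, the original stays untouched and can be reused if the migration needs to be repeated.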

Rules payload

In the rules payload, verify that the userNM aligns with the target environment. Update the payload for each rule that you want to migrate.
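Since the rules payload is an array, the same userNM update can be applied to every rule in one pass. A minimal sketch, assuming userNM is a top-level field on each rule object:

```python
def prepare_rules_payload(rules: list, user_nm: str) -> list:
    """Return copies of the rule definitions with userNM set for the
    target environment. Assumes userNM is a top-level field on each rule."""
    return [{**rule, "userNM": user_nm} for rule in rules]
```

All other rule fields are carried over unchanged.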

Post definitions to target environment

Use the payloads with the following endpoints to migrate code to the target environment.

Post dataset definitions

POST /v3/datasetDefs

Post rules

POST /v3/rules/{dataset}

This endpoint accepts multiple rule definitions (i.e., an array of rules). If a rule already exists, the endpoint overwrites it with the new values.
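Both POST calls follow the same shape: a JSON body plus an authenticated request against the target environment. The target URL and bearer-token authentication below are illustrative assumptions; substitute your own environment's values.

```python
import json
import urllib.request

# Hypothetical target environment and token; substitute your own values.
TARGET_URL = "https://dq-target.example.com"
API_TOKEN = "<your-bearer-token>"

def build_post(path: str, payload) -> urllib.request.Request:
    """Build a POST request carrying a JSON payload for the target environment."""
    return urllib.request.Request(
        f"{TARGET_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To send (requires a reachable target environment):
#   urllib.request.urlopen(build_post("/v3/datasetDefs", dataset_payload))
#   urllib.request.urlopen(build_post("/v3/rules/my_dataset", rules_payload))
```

Note that the rules payload sent to /v3/rules/{dataset} is an array, while the dataset definition sent to /v3/datasetDefs is a single object.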

Run job

After migrating the datasets and rules, use the following endpoint to run the job:

POST /v3/jobs/run

Provide the following parameters:

dataset (string): Specify the name of the dataset that you created.
runDate (string($date-time)): Specify the run date.
runDateEnd (string($date-time)): Required only for datasets that have a date range.
agentName (string): Specify the agentName. Use the endpoint v2/getagents if necessary to obtain the agentName. Required only for Pullup datasets.
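A sketch of assembling the run call, assuming the parameters above are sent as query parameters (confirm the exact request shape on your Swagger page):

```python
import urllib.parse

def run_job_url(base_url, dataset, run_date, run_date_end=None, agent_name=None):
    """Build the URL for POST /v3/jobs/run.

    Assumes the parameters are sent as query parameters; confirm the
    exact request shape on your Swagger page.
    """
    params = {"dataset": dataset, "runDate": run_date}
    if run_date_end is not None:
        params["runDateEnd"] = run_date_end   # datasets with a date range
    if agent_name is not None:
        params["agentName"] = agent_name      # Pullup datasets only
    return f"{base_url}/v3/jobs/run?{urllib.parse.urlencode(params)}"
```

The resulting URL is then posted with the same authentication headers as the earlier calls.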

You can check the status of the job from the Jobs page.