Code Migration using Product APIs
This document describes the process of migrating Collibra Data Quality code across environments using Product APIs. The process supports both Pushdown and Pullup datasets.
The migration process consists of the following steps:
- Get definitions from the source environment
- Update payloads for the target environment
- Post definitions to the target environment
- Run job using the definition
The following sections describe these steps in more detail.
Get definitions from the source environment
Use the following endpoints to get the data that you want to migrate.
Get dataset definitions
GET /v3/datasetDefs/{dataset}
This endpoint returns the dataset definition for the specified dataset.
Get rules
GET /v3/rules
This endpoint returns the definitions of all custom rules associated with the specified dataset.
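As an illustration, the two GET calls above can be assembled with Python's standard library. The host, token, and dataset name below are placeholders, and passing the dataset to /v3/rules as a query parameter is an assumption; adapt all of these to your environment and authentication scheme.

```python
import urllib.request

SOURCE_HOST = "https://dq-source.example.com"  # hypothetical source host
TOKEN = "<your-api-token>"                     # hypothetical credential

def build_get(path: str) -> urllib.request.Request:
    """Build an authenticated GET request for a Product API path."""
    return urllib.request.Request(
        f"{SOURCE_HOST}{path}",
        headers={"Authorization": f"Bearer {TOKEN}"},
        method="GET",
    )

dataset = "public.nyse"  # hypothetical dataset name

# Retrieve the dataset definition and its rules from the source environment.
dataset_def_req = build_get(f"/v3/datasetDefs/{dataset}")
rules_req = build_get(f"/v3/rules?dataset={dataset}")

# To execute: response = urllib.request.urlopen(dataset_def_req)
```

Executing the requests returns JSON bodies that you can save and edit as described in the next section.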
Update payloads for the target environment
After retrieving the dataset and rules definitions, copy them to a text editor. You can then prepare them for use as payloads in the subsequent POST endpoints.
Dataset payload
In the dataset payload, update the following fields for the target environment:
Field | Description |
---|---|
host | Change to the host name of the target environment. Use the endpoint v2/getremotehost if necessary. |
user | Change to the Collibra DQ user that will run the newly migrated datasets. |
agentId | Change the id and uuid values under agentId to match the target environment. Use the endpoint v2/getagents if necessary. |
lib | Verify that the library specified matches the target environment. |
master | Verify that the master URL (specified under spark) matches the target environment. |
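The field updates above can be sketched as straightforward edits to the retrieved JSON. The payload below is a trimmed, hypothetical example; real dataset definitions contain many more fields, and all target values shown here are assumptions (use v2/getremotehost and v2/getagents to look up real ones).

```python
import json

# Trimmed, hypothetical dataset definition returned by GET /v3/datasetDefs/{dataset}.
payload = {
    "dataset": "public.nyse",
    "host": "dq-source.example.com",
    "user": "source_user",
    "agentId": {"id": 1, "uuid": "aaaa-1111"},
    "lib": "/opt/owl/drivers/postgres",
    "spark": {"master": "k8s://source-cluster"},
}

# Update the fields for the target environment (all values hypothetical).
payload["host"] = "dq-target.example.com"            # from v2/getremotehost
payload["user"] = "target_user"
payload["agentId"]["id"] = 2                          # from v2/getagents
payload["agentId"]["uuid"] = "bbbb-2222"              # from v2/getagents
payload["lib"] = "/opt/owl/drivers/postgres"          # verify against target
payload["spark"]["master"] = "k8s://target-cluster"   # verify against target

print(json.dumps(payload, indent=2))
```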
Rules payload
In the rules payload, verify that the userNM value is valid for the target environment. Update the payload for each rule that you want to migrate.
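Because GET /v3/rules returns multiple rules, the userNM update is typically applied in a loop. The rule structure below is a hypothetical, trimmed example; field names and values should be checked against your actual payloads.

```python
# Hypothetical, trimmed rule definitions as returned by GET /v3/rules.
rules = [
    {"ruleNm": "not_null_check", "dataset": "public.nyse", "userNM": "source_user"},
    {"ruleNm": "row_count_check", "dataset": "public.nyse", "userNM": "source_user"},
]

# Align userNM with the target environment for every rule being migrated.
for rule in rules:
    rule["userNM"] = "target_user"  # hypothetical target user
```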
Post definitions to the target environment
Use the payloads with the following endpoints to migrate code to the target environment.
Post dataset definitions
POST /v3/datasetDefs
Post rules
POST /v3/rules/{dataset}
This endpoint accepts multiple rule definitions (an array of rules). If a rule already exists, the endpoint overwrites it with the new values.
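The two POST calls can be assembled in the same style as the earlier GET requests. The target host, token, and trimmed payloads below are assumptions; the real bodies are the full updated definitions prepared in the previous section.

```python
import json
import urllib.request

TARGET_HOST = "https://dq-target.example.com"  # hypothetical target host
TOKEN = "<your-api-token>"                     # hypothetical credential

def build_post(path, body):
    """Build an authenticated JSON POST request for a Product API path."""
    return urllib.request.Request(
        f"{TARGET_HOST}{path}",
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

dataset_def = {"dataset": "public.nyse"}   # updated dataset payload (trimmed)
rules = [{"ruleNm": "not_null_check"}]     # updated rule payloads (trimmed)

post_def_req = build_post("/v3/datasetDefs", dataset_def)
post_rules_req = build_post("/v3/rules/public.nyse", rules)

# To execute: urllib.request.urlopen(post_def_req)
```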
Run job
After migrating the datasets and rules, use the following endpoint to run the job:
POST /v3/jobs/run
Provide the following parameters:
Parameter | Type | Description |
---|---|---|
dataset | string | Specify the name of the dataset that you created. |
runDate | string($date-time) | Specify the run date. |
runDateEnd | string($date-time) | Required only for datasets that have a date range. |
agentName | string | Specify the agentName. Use the endpoint v2/getagents if necessary to obtain the agentName. Required only for Pullup datasets. |
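The job-run call can be sketched by encoding these parameters into the request URL. Passing them as query parameters, and all of the values shown, are assumptions; check the API reference for your version before relying on this shape.

```python
from urllib.parse import urlencode

# Hypothetical run parameters; agentName applies only to Pullup datasets,
# and runDateEnd would be added only for datasets with a date range.
params = {
    "dataset": "public.nyse",
    "runDate": "2024-01-01",
    "agentName": "owl-agent-target",  # look up via v2/getagents
}

query = urlencode(params)
url = f"https://dq-target.example.com/v3/jobs/run?{query}"
```

After POSTing to this URL, you can monitor progress from the Jobs page as noted below.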
You can check the status of the job from the Jobs page.