Using the Export and Import API
Use the Export and Import API to promote and move datasets across environments. The export/import process supports the following tables:
- rule_repo
- owl_rule
- owl_check_repo
- job_schedule
- alert_cond
- alert_output
- dataset_scoring
- dataset_schema
The export/import process consists of the following steps:
- Add the required stored procedures (prerequisite)
- Export content from source
- Import content into target
The following sections describe these steps in more detail.
Add the required stored procedures (prerequisite)
Add the following stored procedure to the Collibra DQMetastore in the target and source environments. Replace <schema>
with the tenant name. Note that this procedure is supported in CRDB version 23.2.16 and later.
CREATE OR REPLACE FUNCTION <schema>.dump(p_schema text, p_table text, p_where text)
RETURNS SETOF text
LANGUAGE plpgsql
AS $function$
DECLARE
dumpquery_0 text;
dumpquery_1 text;
selquery text;
selvalue text;
valrec record;
colrec record;
BEGIN
-- ------ --
-- GLOBAL --
-- build base INSERT
-- build SELECT array[ ... ]
dumpquery_0 := 'INSERT INTO ' || quote_ident(p_schema) || '.' || quote_ident(p_table) || '(';
selquery := 'SELECT array[';
<<label0>>
FOR colrec IN SELECT table_schema, table_name, column_name, data_type
FROM information_schema.columns
WHERE table_name = p_table and table_schema = p_schema
ORDER BY ordinal_position
LOOP
dumpquery_0 := dumpquery_0 || quote_ident(colrec.column_name) || ',';
selquery := selquery || 'CAST(' || quote_ident(colrec.column_name) || ' AS TEXT),';
END LOOP label0;
dumpquery_0 := substring(dumpquery_0 ,1,length(dumpquery_0)-1) || ')';
dumpquery_0 := dumpquery_0 || ' VALUES (';
selquery := substring(selquery ,1,length(selquery)-1) || '] AS MYARRAY';
selquery := selquery || ' FROM ' ||quote_ident(p_schema)||'.'||quote_ident(p_table);
selquery := selquery || ' WHERE '||p_where;
-- GLOBAL --
-- ------ --
-- ----------- --
-- SELECT LOOP --
-- execute SELECT built and loop on each row
<<label1>>
FOR valrec IN EXECUTE selquery
LOOP
dumpquery_1 := '';
IF not found THEN
EXIT ;
END IF;
-- ----------- --
-- LOOP ARRAY (EACH FIELDS) --
<<label2>>
FOREACH selvalue in ARRAY valrec.MYARRAY
LOOP
IF selvalue IS NULL
THEN selvalue := 'NULL';
ELSE selvalue := quote_literal(selvalue);
END IF;
dumpquery_1 := dumpquery_1 || selvalue || ',';
END LOOP label2;
dumpquery_1 := substring(dumpquery_1 ,1,length(dumpquery_1)-1) || ');';
-- LOOP ARRAY (EACH FIELD) --
-- ----------- --
-- debug: RETURN NEXT dumpquery_0 || dumpquery_1 || ' --' || selquery;
-- debug: RETURN NEXT selquery;
RETURN NEXT dumpquery_0 || dumpquery_1;
END LOOP label1 ;
-- SELECT LOOP --
-- ----------- --
RETURN ;
END
$function$
;
Add the following assignment:
alter function dump(text, text, text) owner to <ownername>;
Export content from source
Export content from the source schema. Use the following endpoint:
GET /v2/db-export
This endpoint is available in Swagger under the UI Internal definition.
Parameters
Parameter | Type | Description |
---|---|---|
datasets * |
List of strings | List of datasets to export. You need to give at least one valid dataset name into the list. |
schema
|
String | Name of the schema/tenant where you want to perform the export. \ Default value: public |
tables
|
String | List of tables to export on the given schema & dataset(s). \ If you leave it empty, the following tables will be exported altogether: rule_repo, owl_catalog, owl_rule, alert_cond, owl_check_repo, job_schedule, alert_output |
Response
The endpoint returns 200:OK if the export was successful. The response includes a list of SQL INSERT statements in JSON format. Copy this response. It is required for the next step in the process.
Import content into target
Import content into the target environment. Use the following endpoint:
POST /v2/db-import
This endpoint is available in Swagger under the UI Internal definition.
Request body
In the request body, paste in the JSON response from the export endpoint (/v2/db-export
). Be sure to check the JSON parameters for differences in connections, agents, spark and environment configurations. Migrating to different environments typically requires the payload to be modified.
If importing into the same environment, edit the payload to avoid key constraint conflicts.
Note The import is rung on non-transactional mode, any error happens in the middle, the saved items will be left in the database as is.
Response
The endpoint returns 200:OK if the import was successful. At this point, you may open the user interface and verify that the rules were successfully transferred.