Synchronized data via Google Dataplex ingestion

After the synchronization, assets become available in Collibra.

If no domain mappings are provided, the GCP Project asset and Lake assets are created in the same domain as the System asset.
Zone assets, together with all their corresponding File, Table, and Column assets, are created in new domains that use the following naming convention: {project name}>{lake name}>{zone name}. If you are integrating Google Dataplex Catalog with Google BigQuery, the Schema assets are added to the same domains as the Zone assets.
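
The domain naming convention above can be sketched in a few lines of Python. This is only an illustration of the {project name}>{lake name}>{zone name} pattern; the function name and the sample values are hypothetical and are not part of Collibra or Google Dataplex.

```python
def zone_domain_name(project_name: str, lake_name: str, zone_name: str) -> str:
    """Join the three names with '>' as in {project name}>{lake name}>{zone name}."""
    parts = [project_name, lake_name, zone_name]
    if any(not part for part in parts):
        raise ValueError("project, lake, and zone names must all be non-empty")
    return ">".join(parts)


# Hypothetical names, used only to show the shape of the result.
print(zone_domain_name("my-project", "sales-lake", "raw-zone"))
```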

If you registered the BigQuery data source via the BigQuery JDBC connector, and then integrate Google Dataplex Catalog, assets will be ingested in the same domains that were registered during JDBC ingestion. Specifically, Project assets are registered in the Database domains, and Zone assets are registered in the Schema domains. The mapping created by JDBC ingestion takes priority over the configurations in the Google Dataplex Catalog synchronization capability. In this way, no duplicated tables or columns are created. For more information, go to Ways to work with Google Cloud Platform (GCP).
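
The priority described above can be pictured as a simple fallback chain. The following sketch assumes three possible sources for a target domain: a domain created by JDBC ingestion, a domain mapping configured in the synchronization capability, and the default domain. The function and parameter names are hypothetical; the actual resolution happens inside the integration and is not exposed in this form.

```python
from typing import Optional


def pick_domain(jdbc_domain: Optional[str],
                configured_domain: Optional[str],
                default_domain: str) -> str:
    """Return the domain an asset would land in under the assumed priority order."""
    if jdbc_domain is not None:
        # A mapping created by JDBC ingestion takes priority.
        return jdbc_domain
    if configured_domain is not None:
        # Otherwise use the mapping from the synchronization capability.
        return configured_domain
    # No mapping at all: fall back to the default domain.
    return default_domain


# Hypothetical domain names.
print(pick_domain("BigQuery Schema domain", "Mapped domain", "System domain"))
print(pick_domain(None, None, "System domain"))
```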

Warning: Do not move the assets to another domain. Doing so may lead to errors during future synchronizations.

Tip: Google Dataplex Catalog synchronization relies on UUIDs.

By default, the assets get the Implemented status.

The operating model for Google Dataplex Catalog with Google BigQuery includes the Schema asset type, whereas the operating models for Google Dataplex Catalog with Google Cloud Storage (GCS) assets do not. There are also two operating models for Google Dataplex Catalog with GCS assets: one includes the GCS Bucket asset type, and the other does not. Therefore, the synchronization results and Edge naming conventions differ, as shown in the following information.
By default, the resulting assets are shown in a plain list, but you can enable a multi-path hierarchy to show them in a tree structure.

For GCS assets without GCS Bucket, we recommend that you use the following relations for the best result:

  1. GCP Project groups Dataplex Lake
  2. Dataplex Lake contains Dataplex Zone
  3. Dataplex Zone contains Table
  4. Table contains Column
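
The relation list above can be read as a chain from the GCP Project down to the Column. The following sketch follows that chain and prints it as an indented tree. It only illustrates how the relations nest; the data structure and function names are hypothetical and do not represent a Collibra API or the hierarchy configuration itself.

```python
# (head asset type, role, tail asset type), copied from the list above.
RELATIONS = [
    ("GCP Project", "groups", "Dataplex Lake"),
    ("Dataplex Lake", "contains", "Dataplex Zone"),
    ("Dataplex Zone", "contains", "Table"),
    ("Table", "contains", "Column"),
]


def hierarchy_path(relations, root):
    """Follow head -> tail links from root and return the chain of asset types."""
    child_of = {head: tail for head, _role, tail in relations}
    path = [root]
    # Stop at a leaf, or if a link would revisit a type already in the path.
    while path[-1] in child_of and child_of[path[-1]] not in path:
        path.append(child_of[path[-1]])
    return path


def render_tree(path):
    """Indent each level by two spaces per depth."""
    return "\n".join("  " * depth + name for depth, name in enumerate(path))


print(render_tree(hierarchy_path(RELATIONS, "GCP Project")))
```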

Important: In Collibra 2024.05, we launched a new user interface (UI) for Collibra Data Intelligence Platform. You can learn more about this latest UI in the UI overview.

The following image shows the resulting hierarchical table:

Edge naming conventions for created GCS assets without GCS Bucket

The assets created when integrating Dataplex receive a unique full name (fully qualifying name) based on the following naming convention:

| Asset type | Naming convention | Example |
| --- | --- | --- |
| GCP Project | systemAssetID>projectID | 915b4a91-2aa1-4554-aca2-c0dd9c029253>integrations-automated-uer |
| Dataplex Location | systemAssetID>projectID>location | 915b4a91-2aa1-4554-aca2-c0dd9c029253>integrations-automated-uer>europe-west1 |
| Dataplex Lake | systemAssetID>projectID>location>lakeID | 915b4a91-2aa1-4554-aca2-c0dd9c029253>integrations-automated-uer>europe-west1>vb-mapping-test-1 |
| Dataplex Zone | systemAssetID>projectID>location>lakeID>zoneID | 915b4a91-2aa1-4554-aca2-c0dd9c029253>integrations-automated-uer>europe-west1>vb-mapping-test-1>vb_mapping_zone_1 |
| Table | systemAssetID>projectID>location>lakeID>zoneID>tableName | 915b4a91-2aa1-4554-aca2-c0dd9c029253>integrations-automated-uer>europe-west1>vb-mapping-test-1>vb_mapping_zone_1>zone_1 |
| Column | systemAssetID>projectID>location>lakeID>zoneID>tableName>columnName(column) | 915b4a91-2aa1-4554-aca2-c0dd9c029253>integrations-automated-uer>europe-west1>vb-mapping-test-1>vb_mapping_zone_1>zone_1>_c0(column) |

Note: systemAssetID is to be deprecated in a future release.
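
To make the convention above concrete, the following sketch composes the full name for each level from its parts. It is an illustration only: the function name is hypothetical, and the integration generates these names itself. The sample values are taken from the Example column.

```python
def build_full_names(system_asset_id, project_id, location,
                     lake_id, zone_id, table_name, column_name):
    """Return a dict of asset type -> full name for GCS assets without GCS Bucket."""
    parts = [system_asset_id, project_id, location, lake_id, zone_id, table_name]
    levels = ["GCP Project", "Dataplex Location", "Dataplex Lake",
              "Dataplex Zone", "Table"]
    names = {}
    for index, level in enumerate(levels):
        # GCP Project uses the first two parts; each deeper level adds one more.
        names[level] = ">".join(parts[: index + 2])
    # A column appends its name plus the literal "(column)" suffix.
    names["Column"] = names["Table"] + ">" + column_name + "(column)"
    return names


names = build_full_names(
    "915b4a91-2aa1-4554-aca2-c0dd9c029253", "integrations-automated-uer",
    "europe-west1", "vb-mapping-test-1", "vb_mapping_zone_1", "zone_1", "_c0",
)
for asset_type, name in names.items():
    print(asset_type, "->", name)
```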

You can view the full name of an asset by editing the asset.

Warning: Don't edit the full name of assets because the name is needed to synchronize or refresh data sources. Changing the full name may cause unexpected results and break the synchronization or refresh process.

Synchronized metadata per GCS asset type

This table shows the metadata for each Google Dataplex Catalog asset type. If you do not see any of the listed synchronized metadata, you can add characteristics to the layout on the asset type page.

| Asset type | Synchronized metadata | Resource ID |
| --- | --- | --- |
| GCP Project | Technology Asset groups / is grouped by Technology Asset | 00000000-0000-0000-0000-00000000705 |
| Dataplex Lake | Labels | 00000000-0000-0000-0001-002600000003 |
| | Lake Id | 00000000-0000-0000-0001-002600000004 |
| | Location | 00000000-0000-0000-0000-000000000203 |
| | Lake Status | 00000000-0000-0000-0001-002600000005 |
| | Description from source system | 00000000-0000-0000-0001-000500000074 |
| | URL | 00000000-0000-0000-0000-000000000258 |
| | GCP Project groups / is grouped by Dataplex Lake | 00000000-0000-0000-0001-002700000000 |
| | Any custom attribute that you added. You can use the custom attributes for custom label mapping when you synchronize Google Dataplex Catalog. | |
| Dataplex Zone | Labels | 00000000-0000-0000-0001-002600000003 |
| | Zone Name | 00000000-0000-0000-0001-002600000007 |
| | Location | 00000000-0000-0000-0000-000000000203 |
| | Description from source system | 00000000-0000-0000-0001-000500000074 |
| | URL | 00000000-0000-0000-0000-000000000258 |
| | Any extensible properties defined via the capability. | |
| | Dataplex Lake contains / is part of Dataplex Zone | 00000000-0000-0000-0001-002700000001 |
| | Any custom attribute that you added. You can use the custom attributes for custom label mapping when you synchronize Google Dataplex Catalog. | |
| Table | Any extensible properties defined via the capability. | |
| | Schema contains / is part of Table | 00000000-0000-0000-0001-002700000002 |
| | Description from source system | 00000000-0000-0000-0001-000500000074 |
| Column | Technical Data Type | 00000000-0000-0000-0000-000000000219 |
| | Column Position | 00000000-0000-0000-0001-000500000020 |
| | Is Nullable | 00000000-0000-0000-0001-000500000011 |
| | Column is part of / contains Table | 00000000-0000-0000-0000-000000007042 |
| | Description from source system | 00000000-0000-0000-0001-000500000074 |
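
If some synchronized metadata is not visible, it can help to compare the resource IDs in the table above with the characteristics present in a layout. The following sketch does that comparison for the Dataplex Lake attributes. The set of layout IDs is a hypothetical input that you would have to collect yourself; the sketch does not call any Collibra API.

```python
# Attribute name -> resource ID, copied from the Dataplex Lake rows above.
DATAPLEX_LAKE_ATTRIBUTES = {
    "Labels": "00000000-0000-0000-0001-002600000003",
    "Lake Id": "00000000-0000-0000-0001-002600000004",
    "Location": "00000000-0000-0000-0000-000000000203",
    "Lake Status": "00000000-0000-0000-0001-002600000005",
    "Description from source system": "00000000-0000-0000-0001-000500000074",
    "URL": "00000000-0000-0000-0000-000000000258",
}


def missing_characteristics(layout_ids):
    """Return the attribute names whose resource ID is absent from the layout."""
    return [name for name, resource_id in DATAPLEX_LAKE_ATTRIBUTES.items()
            if resource_id not in layout_ids]


# Hypothetical layout that currently shows only Location and URL.
layout = {
    "00000000-0000-0000-0000-000000000203",
    "00000000-0000-0000-0000-000000000258",
}
print(missing_characteristics(layout))
```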

For GCS assets with GCS Bucket, we recommend that you use the following relations for the best result:

  1. GCP Project groups Dataplex Lake
  2. Dataplex Lake contains Dataplex Zone
  3. Dataplex Zone contains GCS Bucket
  4. GCS Bucket contains Table
  5. Table contains Column

The following image shows the resulting hierarchical table:

Edge naming conventions for created GCS assets with GCS Bucket

The assets created when integrating Dataplex receive a unique full name (fully qualifying name) based on the following naming convention:

| Asset type | Naming convention | Example |
| --- | --- | --- |
| GCP Project | systemAssetID>projectID | 018e50b8-d92c-72c0-a421-812b6c1d594a>catalog-integrations |
| Dataplex Location | systemAssetID>projectID>location | 018e50b8-d92c-72c0-a421-812b6c1d594a>catalog-integrations>europe-central2 |
| Dataplex Lake | systemAssetID>projectID>location>lakeID | 018e50b8-d92c-72c0-a421-812b6c1d594a>catalog-integrations>europe-central2>vb-test-data |
| Dataplex Zone | systemAssetID>projectID>location>lakeID>zoneID | 018e50b8-d92c-72c0-a421-812b6c1d594a>catalog-integrations>europe-central2>vb-test-data>vb_test_data_zone |
| GCS Bucket | systemAssetID>projectID>location>lakeID>zoneID>gcsBucket | 018e50b8-d92c-72c0-a421-812b6c1d594a>catalog-integrations>europe-central2>vb-test-data>vb_test_data_zone>vb-test-data |
| Table | systemAssetID>projectID>location>lakeID>zoneID>gcsBucket>tableName | 018e50b8-d92c-72c0-a421-812b6c1d594a>catalog-integrations>europe-central2>vb-test-data>vb_test_data_zone>vb-test-data>type_csv |
| Column | systemAssetID>projectID>location>lakeID>zoneID>gcsBucket>tableName>columnName(column) | 018e50b8-d92c-72c0-a421-812b6c1d594a>catalog-integrations>europe-central2>vb-test-data>vb_test_data_zone>vb-test-data>type_csv>_c0(column) |

Note: systemAssetID is to be deprecated in a future release.
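
Going the other way, a full name that follows the convention above can be split back into its parts. The sketch below is a read-only illustration that assumes no individual part contains the ">" character; the function name is hypothetical. The sample values come from the Example column.

```python
FIELDS = ["systemAssetID", "projectID", "location", "lakeID",
          "zoneID", "gcsBucket", "tableName", "columnName"]
COLUMN_SUFFIX = "(column)"


def parse_full_name(full_name):
    """Split a full name (GCS assets with GCS Bucket) into named parts."""
    segments = full_name.split(">")
    if not 2 <= len(segments) <= len(FIELDS):
        raise ValueError("unexpected number of segments: %d" % len(segments))
    if len(segments) == len(FIELDS):
        # Only column names carry the literal "(column)" suffix.
        if not segments[-1].endswith(COLUMN_SUFFIX):
            raise ValueError("column segment must end with '(column)'")
        segments[-1] = segments[-1][: -len(COLUMN_SUFFIX)]
    # zip stops at the shorter sequence, so shallower names yield fewer keys.
    return dict(zip(FIELDS, segments))


column = parse_full_name(
    "018e50b8-d92c-72c0-a421-812b6c1d594a>catalog-integrations>europe-central2"
    ">vb-test-data>vb_test_data_zone>vb-test-data>type_csv>_c0(column)"
)
print(column["gcsBucket"], column["tableName"], column["columnName"])
```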

You can view the full name of an asset by editing the asset.

Warning: Don't edit the full name of assets because the name is needed to synchronize or refresh data sources. Changing the full name may cause unexpected results and break the synchronization or refresh process.

Synchronized metadata per GCS asset type

This table shows the metadata for each Google Dataplex Catalog asset type. If you do not see any of the listed synchronized metadata, you can add characteristics to the layout on the asset type page.

| Asset type | Synchronized metadata | Resource ID |
| --- | --- | --- |
| GCP Project | Technology Asset groups / is grouped by Technology Asset | 00000000-0000-0000-0000-00000000705 |
| Dataplex Lake | Lake Id | 00000000-0000-0000-0001-002600000004 |
| | Location | 00000000-0000-0000-0000-000000000203 |
| | Lake Status | 00000000-0000-0000-0001-002600000005 |
| | Description from source system | 00000000-0000-0000-0001-000500000074 |
| | URL | 00000000-0000-0000-0000-000000000258 |
| | GCP Project groups / is grouped by Dataplex Lake | 00000000-0000-0000-0001-002700000000 |
| | Any custom attribute that you added. You can use the custom attributes for custom label mapping when you synchronize Google Dataplex Catalog. | |
| Dataplex Zone | Zone Name | 00000000-0000-0000-0001-002600000007 |
| | Location | 00000000-0000-0000-0000-000000000203 |
| | Any extensible properties defined via the capability. | |
| | Description from source system | 00000000-0000-0000-0001-000500000074 |
| | URL | 00000000-0000-0000-0000-000000000258 |
| | Dataplex Lake contains / is part of Dataplex Zone | 00000000-0000-0000-0001-002700000001 |
| | Any custom attribute that you added. You can use the custom attributes for custom label mapping when you synchronize Google Dataplex Catalog. | |
| GCS Bucket | URL | 00000000-0000-0000-0000-000000000258 |
| | Location | 00000000-0000-0000-0000-000000000203 |
| | External System Label | 00000000-0000-0000-0001-002730000001 |
| | GCS Bucket is part of Dataplex Zone | 00000000-0000-0000-0001-002700000004 |
| | GCS Bucket contains Table | 00000000-0000-0000-0001-002700000005 |
| Table | Any extensible properties defined via the capability. | |
| | Schema contains / is part of Table | 00000000-0000-0000-0001-002700000002 |
| | Description from source system | 00000000-0000-0000-0001-000500000074 |
| Column | Technical Data Type | 00000000-0000-0000-0000-000000000219 |
| | Column Position | 00000000-0000-0000-0001-000500000020 |
| | Is Nullable | 00000000-0000-0000-0001-000500000011 |
| | Column is part of / contains Table | 00000000-0000-0000-0000-000000007042 |
| | Description from source system | 00000000-0000-0000-0001-000500000074 |

For Google BigQuery assets, we recommend that you use the following relations for the best result:

  1. GCP Project groups Dataplex Lake
  2. Dataplex Lake contains Dataplex Zone
  3. Dataplex Zone contains Schema
  4. Schema contains Table
  5. Table contains Column

The following image shows the resulting hierarchical table:

Edge naming conventions for created Google BigQuery assets

The assets created when integrating Dataplex receive a unique full name (fully qualifying name) based on the following naming convention:

| Asset type | Naming convention | Example |
| --- | --- | --- |
| GCP Project | systemAssetID>projectID | 018c90f0-4a28-786f-9594-58ae8d88c5f9>integrations-automated-uer |
| Dataplex Lake | systemAssetID>projectID>location>lakeID | 018c90f0-4a28-786f-9594-58ae8d88c5f9>integrations-automated-uer>europe-west1>kasia-bq-lake |
| Dataplex Zone | systemAssetID>projectID>location>lakeID>zoneID | 018c90f0-4a28-786f-9594-58ae8d88c5f9>integrations-automated-uer>europe-west1>kasia-bq-lake>kasia_bq_zone |
| Schema | systemAssetID>projectID>schemaName | Big Query - ks>integrations-automated-uer>kasia_bq_dataset |
| Table | systemAssetID>projectID>schemaName>tableName | Big Query - ks>integrations-automated-uer>kasia_bq_dataset>kasia_bq_table |
| Column | systemAssetID>projectID>schemaName>tableName>columnName(column) | Big Query - ks>integrations-automated-uer>kasia_bq_dataset>kasia_bq_table>string_field_1(column) |

Note: systemAssetID is to be deprecated in a future release.
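
In the convention above, the Schema, Table, and Column names omit the location, lake, and zone parts, so a Column full name contains the full names of its Table and Schema as prefixes. The following sketch derives both from a Column full name. It is an illustration only, assumes that no part contains the ">" character, and uses a hypothetical function name.

```python
def schema_and_table_names(column_full_name):
    """Return (schema full name, table full name) for a BigQuery Column full name."""
    segments = column_full_name.split(">")
    # Expected shape: systemAssetID>projectID>schemaName>tableName>columnName(column)
    if len(segments) != 5 or not segments[-1].endswith("(column)"):
        raise ValueError("not a BigQuery Column full name")
    schema_name = ">".join(segments[:3])
    table_name = ">".join(segments[:4])
    return schema_name, table_name


schema_name, table_name = schema_and_table_names(
    "Big Query - ks>integrations-automated-uer>kasia_bq_dataset"
    ">kasia_bq_table>string_field_1(column)"
)
print(schema_name)
print(table_name)
```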

You can view the full name of an asset by editing the asset.

Warning: Don't edit the full name of assets because the name is needed to synchronize or refresh data sources. Changing the full name may cause unexpected results and break the synchronization or refresh process.

Synchronized metadata per Google BigQuery asset type

This table shows the metadata for each Google Dataplex Catalog asset type. If you do not see any of the listed synchronized metadata, you can add characteristics to the layout on the asset type page.

| Asset type | Synchronized metadata | Resource ID |
| --- | --- | --- |
| GCP Project | Technology Asset groups / is grouped by Technology Asset | 00000000-0000-0000-0000-00000000705 |
| Dataplex Lake | Lake Id | 00000000-0000-0000-0001-002600000004 |
| | Location | 00000000-0000-0000-0000-000000000203 |
| | Lake Status | 00000000-0000-0000-0001-002600000005 |
| | Description from source system | 00000000-0000-0000-0001-000500000074 |
| | URL | 00000000-0000-0000-0000-000000000258 |
| | GCP Project groups / is grouped by Dataplex Lake | 00000000-0000-0000-0001-002700000000 |
| | Any custom attribute that you added. You can use the custom attributes for custom label mapping when you synchronize Google Dataplex Catalog. | |
| Dataplex Zone | Zone Name | 00000000-0000-0000-0001-002600000007 |
| | Location | 00000000-0000-0000-0000-000000000203 |
| | Any extensible properties defined via the capability. | |
| | Description from source system | 00000000-0000-0000-0001-000500000074 |
| | URL | 00000000-0000-0000-0000-000000000258 |
| | Dataplex Lake contains / is part of Dataplex Zone | 00000000-0000-0000-0001-002700000001 |
| | Any custom attribute that you added. You can use the custom attributes for custom label mapping when you synchronize Google Dataplex Catalog. | |
| Schema | Data Source Type | 00000000-0000-0000-0001-000500000018 |
| | Description from source system | 00000000-0000-0000-0001-000500000074 |
| | Dataplex Zone contains / is part of Schema | |
| | Any custom attribute that you added. You can use the custom attributes for custom label mapping when you synchronize Google Dataplex Catalog. | |
| Table | Any extensible properties defined via the capability. | |
| | Schema contains / is part of Table | 00000000-0000-0000-0001-002700000002 |
| | Description from source system | 00000000-0000-0000-0001-000500000074 |
| Column | Technical Data Type | 00000000-0000-0000-0000-000000000219 |
| | Column Position | 00000000-0000-0000-0001-000500000020 |
| | Is Nullable | 00000000-0000-0000-0001-000500000011 |
| | Column is part of / contains Table | 00000000-0000-0000-0000-000000007042 |
| | Description from source system | 00000000-0000-0000-0001-000500000074 |