Synchronized data via Google Dataplex ingestion
After the synchronization, assets become available in Collibra.
If no domain mappings are provided, the GCP Project asset and Lake assets are created in the same domain as the System asset.
Zone assets, together with all corresponding File, Table, and Column assets, are created in new domains that use the following naming convention: {project name}>{lake name}>{zone name}. If you are integrating Google Dataplex Catalog with Google BigQuery, the Schema assets are added to the same domains as the Zone assets.
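As a minimal sketch of the domain naming convention above, the three names can be joined with the `>` separator. The function name and the sample values are hypothetical and only illustrate the pattern; they are not part of the integration.

```python
def zone_domain_name(project_name: str, lake_name: str, zone_name: str) -> str:
    """Build a domain name following {project name}>{lake name}>{zone name}."""
    return ">".join([project_name, lake_name, zone_name])


# Hypothetical sample values, used only to show the resulting shape.
print(zone_domain_name("my-project", "my-lake", "my_zone"))
```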
If you registered the BigQuery data source via the BigQuery JDBC connector and then integrate Google Dataplex Catalog, assets are ingested in the same domains that were used during JDBC ingestion. Specifically, Project assets are registered in the Database domains, and Zone assets are registered in the Schema domains. The mapping created by JDBC ingestion takes priority over the configurations in the Google Dataplex Catalog synchronization capability, so no duplicate tables or columns are created. For more information, go to Ways to work with Google Cloud Platform (GCP).
Warning: Do not move the assets to another domain. Doing so may lead to errors during future synchronizations.
Tip: Google Dataplex Catalog synchronization relies on UUIDs.
By default, the assets get the Implemented status.
The operating model for Google Dataplex Catalog with Google BigQuery includes the Schema asset type, whereas the operating models for Google Dataplex Catalog with Google Cloud Storage (GCS) assets do not. There are also two operating models for Google Dataplex Catalog with GCS assets: one includes the GCS Bucket asset type, and the other does not. Therefore, the synchronization results and Edge naming conventions differ, as shown in the following information.
By default, the resulting assets are shown in a plain list, but you can enable a multi-path hierarchy to show them in a tree structure.
- Synchronization results for GCS assets without GCS Bucket
- Synchronization results for GCS assets with GCS bucket
- Synchronization results for Google BigQuery
For the best result, we recommend that you use the following relations:
- GCP Project groups Dataplex Lake
- Dataplex Lake contains Dataplex Zone
- Dataplex Zone contains Table
- Table contains Column
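The recommended relations form a single chain from GCP Project down to Column. As a minimal sketch, assuming you keep them as (head, role, tail) triples in your own tooling, the chain can be walked as follows. The `RELATIONS` list and the `hierarchy_path` helper are illustrative names, not part of Collibra.

```python
# Relations for GCS assets without GCS Bucket, copied from the list above.
RELATIONS = [
    ("GCP Project", "groups", "Dataplex Lake"),
    ("Dataplex Lake", "contains", "Dataplex Zone"),
    ("Dataplex Zone", "contains", "Table"),
    ("Table", "contains", "Column"),
]


def hierarchy_path(relations, root):
    """Follow head -> tail links from root until no further relation exists."""
    child_of = {head: tail for head, _role, tail in relations}
    path = [root]
    while path[-1] in child_of:
        path.append(child_of[path[-1]])
    return path


print(" > ".join(hierarchy_path(RELATIONS, "GCP Project")))
```

The same helper works for the other two operating models if the triples are replaced with the relations listed for them.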
In Collibra 2024.05, we launched a new user interface (UI) for Collibra Data Intelligence Platform! You can learn more about this latest UI in the UI overview.
The following image shows the resulting hierarchical table:
Edge naming conventions for created GCS assets without GCS Bucket
The assets created when integrating Dataplex receive a unique full name (fully qualifying name) based on the following naming convention:
Asset type | Naming convention | Example |
---|---|---|
GCP Project | systemAssetID>projectID | 915b4a91-2aa1-4554-aca2-c0dd9c029253>integrations-automated-uer |
Dataplex Location | systemAssetID>projectID>location | 915b4a91-2aa1-4554-aca2-c0dd9c029253>integrations-automated-uer>europe-west1 |
Dataplex Lake | systemAssetID>projectID>location>lakeID | 915b4a91-2aa1-4554-aca2-c0dd9c029253>integrations-automated-uer>europe-west1>vb-mapping-test-1 |
Dataplex Zone | systemAssetID>projectID>location>lakeID>zoneID | 915b4a91-2aa1-4554-aca2-c0dd9c029253>integrations-automated-uer>europe-west1>vb-mapping-test-1>vb_mapping_zone_1 |
Table | systemAssetID>projectID>location>lakeID>zoneID>tableName | 915b4a91-2aa1-4554-aca2-c0dd9c029253>integrations-automated-uer>europe-west1>vb-mapping-test-1>vb_mapping_zone_1>zone_1 |
Column | systemAssetID>projectID>location>lakeID>zoneID>tableName>columnName(column) | 915b4a91-2aa1-4554-aca2-c0dd9c029253>integrations-automated-uer>europe-west1>vb-mapping-test-1>vb_mapping_zone_1>zone_1>_c0(column) |

Note: systemAssetID is to be deprecated in a future release.
You can view the full name of an asset by editing the asset.
Warning: Don't edit the full name of assets, because the name is needed to synchronize or refresh data sources. Changing the full name may cause unexpected results and break the synchronization or refresh process.
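To make the naming convention for GCS assets without GCS Bucket concrete, the following sketch joins the path segments with `>` and appends the `(column)` suffix for Column assets. The `build_full_name` helper is a hypothetical name used for illustration only; the integration generates these names itself, so the snippet is meant for reading or checking full names, not for setting them.

```python
from typing import Optional


def build_full_name(system_asset_id: str, project_id: str, *path: str,
                    column: Optional[str] = None) -> str:
    """Join segments with '>' and add the '(column)' suffix for a Column."""
    parts = [system_asset_id, project_id, *path]
    if column is not None:
        parts.append(f"{column}(column)")
    return ">".join(parts)


# Values taken from the example column of the naming convention table.
column_full_name = build_full_name(
    "915b4a91-2aa1-4554-aca2-c0dd9c029253",
    "integrations-automated-uer",
    "europe-west1",
    "vb-mapping-test-1",
    "vb_mapping_zone_1",
    "zone_1",
    column="_c0",
)
print(column_full_name)
```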
Synchronized metadata per GCS asset type
This table shows the metadata for each Google Dataplex Catalog asset type.
Asset type | Synchronized metadata | Resource ID |
---|---|---|
GCP Project | Technology Asset groups / is grouped by Technology Asset | 00000000-0000-0000-0000-00000000705 |
Dataplex Lake | Labels | 00000000-0000-0000-0001-002600000003 |
Dataplex Lake | Lake Id | 00000000-0000-0000-0001-002600000004 |
Dataplex Lake | Location | 00000000-0000-0000-0000-000000000203 |
Dataplex Lake | Lake Status | 00000000-0000-0000-0001-002600000005 |
Dataplex Lake | Description from source system | 00000000-0000-0000-0001-000500000074 |
Dataplex Lake | URL | 00000000-0000-0000-0000-000000000258 |
Dataplex Lake | GCP Project groups / is grouped by Dataplex Lake | 00000000-0000-0000-0001-002700000000 |
Dataplex Lake | Any custom attribute that you added. You can use the custom attributes for custom label mapping when you synchronize Google Dataplex Catalog. | |
Dataplex Zone | Labels | 00000000-0000-0000-0001-002600000003 |
Dataplex Zone | Zone Name | 00000000-0000-0000-0001-002600000007 |
Dataplex Zone | Location | 00000000-0000-0000-0000-000000000203 |
Dataplex Zone | Description from source system | 00000000-0000-0000-0001-000500000074 |
Dataplex Zone | URL | 00000000-0000-0000-0000-000000000258 |
Dataplex Zone | Any extensible properties defined via the capability. | |
Dataplex Zone | Dataplex Lake contains / is part of Dataplex Zone | 00000000-0000-0000-0001-002700000001 |
Dataplex Zone | Any custom attribute that you added. You can use the custom attributes for custom label mapping when you synchronize Google Dataplex Catalog. | |
Table | Any extensible properties defined via the capability. | |
Table | Schema contains / is part of Table | 00000000-0000-0000-0001-002700000002 |
Table | Description from source system | 00000000-0000-0000-0001-000500000074 |
Column | Technical Data Type | 00000000-0000-0000-0000-000000000219 |
Column | Column Position | 00000000-0000-0000-0001-000500000020 |
Column | Is Nullable | 00000000-0000-0000-0001-000500000011 |
Column | Column is part of / contains Table | 00000000-0000-0000-0000-000000007042 |
Column | Description from source system | 00000000-0000-0000-0001-000500000074 |
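The resource IDs in this table follow the canonical 8-4-4-4-12 hexadecimal UUID layout. If you copy them into scripts, a format check can catch truncated values early. For example, the resource ID printed for the GCP Project relation has only 11 characters in its last group, so verify that value in your environment before relying on it. The sketch below uses a hypothetical helper name and checks only the format, not whether the ID exists in Collibra.

```python
import re

# Canonical textual UUID layout: 8-4-4-4-12 hexadecimal characters.
UUID_PATTERN = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def is_canonical_uuid(value: str) -> bool:
    """Return True if value matches the 8-4-4-4-12 hexadecimal layout."""
    return UUID_PATTERN.fullmatch(value) is not None


# IDs copied from the table above.
lake_id_attribute = "00000000-0000-0000-0001-002600000004"
project_relation = "00000000-0000-0000-0000-00000000705"

print(is_canonical_uuid(lake_id_attribute))
print(is_canonical_uuid(project_relation))
```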
For the best result, we recommend that you use the following relations:
- GCP Project groups Dataplex Lake
- Dataplex Lake contains Dataplex Zone
- Dataplex Zone contains GCS Bucket
- GCS Bucket contains Table
- Table contains Column
The following image shows the resulting hierarchical table:
Edge naming conventions for created GCS assets with GCS Bucket
The assets created when integrating Dataplex receive a unique full name (fully qualifying name) based on the following naming convention:
Asset type | Naming convention | Example |
---|---|---|
GCP Project | systemAssetID>projectID | 018e50b8-d92c-72c0-a421-812b6c1d594a>catalog-integrations |
Dataplex Location | systemAssetID>projectID>location | 018e50b8-d92c-72c0-a421-812b6c1d594a>catalog-integrations>europe-central2 |
Dataplex Lake | systemAssetID>projectID>location>lakeID | 018e50b8-d92c-72c0-a421-812b6c1d594a>catalog-integrations>europe-central2>vb-test-data |
Dataplex Zone | systemAssetID>projectID>location>lakeID>zoneID | 018e50b8-d92c-72c0-a421-812b6c1d594a>catalog-integrations>europe-central2>vb-test-data>vb_test_data_zone |
GCS Bucket | systemAssetID>projectID>location>lakeID>zoneID>gcsBucket | 018e50b8-d92c-72c0-a421-812b6c1d594a>catalog-integrations>europe-central2>vb-test-data>vb_test_data_zone>vb-test-data |
Table | systemAssetID>projectID>location>lakeID>zoneID>gcsBucket>tableName | 018e50b8-d92c-72c0-a421-812b6c1d594a>catalog-integrations>europe-central2>vb-test-data>vb_test_data_zone>vb-test-data>type_csv |
Column | systemAssetID>projectID>location>lakeID>zoneID>gcsBucket>tableName>columnName(column) | 018e50b8-d92c-72c0-a421-812b6c1d594a>catalog-integrations>europe-central2>vb-test-data>vb_test_data_zone>vb-test-data>type_csv>_c0(column) |

Note: systemAssetID is to be deprecated in a future release.
You can view the full name of an asset by editing the asset.
Warning: Don't edit the full name of assets, because the name is needed to synchronize or refresh data sources. Changing the full name may cause unexpected results and break the synchronization or refresh process.
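Going the other way, a full name for GCS assets with GCS Bucket can be split back into its segments for inspection. This is a sketch under two assumptions: no segment itself contains the `>` character, and the segment labels match the naming convention column of the table. The `parse_full_name` helper is a hypothetical name and only reads names; it does not change anything in Collibra.

```python
# Segment labels in order, as listed in the naming convention column.
LEVELS = ["systemAssetID", "projectID", "location", "lakeID",
          "zoneID", "gcsBucket", "tableName", "columnName"]
COLUMN_SUFFIX = "(column)"


def parse_full_name(full_name: str) -> dict:
    """Map each '>'-separated segment of a full name to its level label."""
    parts = full_name.split(">")
    if not 2 <= len(parts) <= len(LEVELS):
        raise ValueError(f"unexpected number of segments: {len(parts)}")
    if len(parts) == len(LEVELS):
        # Only Column full names reach the last level and carry the suffix.
        if not parts[-1].endswith(COLUMN_SUFFIX):
            raise ValueError("column segment must end with '(column)'")
        parts[-1] = parts[-1][: -len(COLUMN_SUFFIX)]
    return dict(zip(LEVELS, parts))


# Table example copied from the naming convention table.
table_segments = parse_full_name(
    "018e50b8-d92c-72c0-a421-812b6c1d594a>catalog-integrations>europe-central2"
    ">vb-test-data>vb_test_data_zone>vb-test-data>type_csv"
)
print(table_segments["gcsBucket"], table_segments["tableName"])
```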
Synchronized metadata per GCS asset type
This table shows the metadata for each Google Dataplex Catalog asset type.
Asset type | Synchronized metadata | Resource ID |
---|---|---|
GCP Project | Technology Asset groups / is grouped by Technology Asset | 00000000-0000-0000-0000-00000000705 |
Dataplex Lake | Lake Id | 00000000-0000-0000-0001-002600000004 |
Dataplex Lake | Location | 00000000-0000-0000-0000-000000000203 |
Dataplex Lake | Lake Status | 00000000-0000-0000-0001-002600000005 |
Dataplex Lake | Description from source system | 00000000-0000-0000-0001-000500000074 |
Dataplex Lake | URL | 00000000-0000-0000-0000-000000000258 |
Dataplex Lake | GCP Project groups / is grouped by Dataplex Lake | 00000000-0000-0000-0001-002700000000 |
Dataplex Lake | Any custom attribute that you added. You can use the custom attributes for custom label mapping when you synchronize Google Dataplex Catalog. | |
Dataplex Zone | Zone Name | 00000000-0000-0000-0001-002600000007 |
Dataplex Zone | Location | 00000000-0000-0000-0000-000000000203 |
Dataplex Zone | Any extensible properties defined via the capability. | |
Dataplex Zone | Description from source system | 00000000-0000-0000-0001-000500000074 |
Dataplex Zone | URL | 00000000-0000-0000-0000-000000000258 |
Dataplex Zone | Dataplex Lake contains / is part of Dataplex Zone | 00000000-0000-0000-0001-002700000001 |
Dataplex Zone | Any custom attribute that you added. You can use the custom attributes for custom label mapping when you synchronize Google Dataplex Catalog. | |
GCS Bucket | URL | 00000000-0000-0000-0000-000000000258 |
GCS Bucket | Location | 00000000-0000-0000-0000-000000000203 |
GCS Bucket | External System Label | 00000000-0000-0000-0001-002730000001 |
GCS Bucket | GCS Bucket is part of Dataplex Zone | 00000000-0000-0000-0001-002700000004 |
GCS Bucket | GCS Bucket contains Table | 00000000-0000-0000-0001-002700000005 |
Table | Any extensible properties defined via the capability. | |
Table | Schema contains / is part of Table | 00000000-0000-0000-0001-002700000002 |
Table | Description from source system | 00000000-0000-0000-0001-000500000074 |
Column | Technical Data Type | 00000000-0000-0000-0000-000000000219 |
Column | Column Position | 00000000-0000-0000-0001-000500000020 |
Column | Is Nullable | 00000000-0000-0000-0001-000500000011 |
Column | Column is part of / contains Table | 00000000-0000-0000-0000-000000007042 |
Column | Description from source system | 00000000-0000-0000-0001-000500000074 |
For the best result, we recommend that you use the following relations:
- GCP Project groups Dataplex Lake
- Dataplex Lake contains Dataplex Zone
- Dataplex Zone contains Schema
- Schema contains Table
- Table contains Column
The following image shows the resulting hierarchical table:
Edge naming conventions for created Google BigQuery assets
The assets created when integrating Dataplex receive a unique full name (fully qualifying name) based on the following naming convention:
Asset type | Naming convention | Example |
---|---|---|
GCP Project | systemAssetID>projectID | 018c90f0-4a28-786f-9594-58ae8d88c5f9>integrations-automated-uer |
Dataplex Lake | systemAssetID>projectID>location>lakeID | 018c90f0-4a28-786f-9594-58ae8d88c5f9>integrations-automated-uer>europe-west1>kasia-bq-lake |
Dataplex Zone | systemAssetID>projectID>location>lakeID>zoneID | 018c90f0-4a28-786f-9594-58ae8d88c5f9>integrations-automated-uer>europe-west1>kasia-bq-lake>kasia_bq_zone |
Schema | systemAssetID>projectID>schemaName | Big Query - ks>integrations-automated-uer>kasia_bq_dataset |
Table | systemAssetID>projectID>schemaName>tableName | Big Query - ks>integrations-automated-uer>kasia_bq_dataset>kasia_bq_table |
Column | systemAssetID>projectID>schemaName>tableName>columnName(column) | Big Query - ks>integrations-automated-uer>kasia_bq_dataset>kasia_bq_table>string_field_1(column) |

Note: systemAssetID is to be deprecated in a future release.
You can view the full name of an asset by editing the asset.
Warning: Don't edit the full name of assets, because the name is needed to synchronize or refresh data sources. Changing the full name may cause unexpected results and break the synchronization or refresh process.
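For Google BigQuery, the Schema, Table, and Column full names skip the location, lake, and zone segments that the Dataplex Lake and Dataplex Zone names carry. A table-driven sketch makes that difference visible. The `BIGQUERY_CONVENTIONS` mapping and the `full_name` helper are hypothetical names, and the sample values are copied from the example column of the table.

```python
# Ordered segment labels per asset type, taken from the naming convention column.
BIGQUERY_CONVENTIONS = {
    "GCP Project": ["systemAssetID", "projectID"],
    "Dataplex Lake": ["systemAssetID", "projectID", "location", "lakeID"],
    "Dataplex Zone": ["systemAssetID", "projectID", "location", "lakeID", "zoneID"],
    "Schema": ["systemAssetID", "projectID", "schemaName"],
    "Table": ["systemAssetID", "projectID", "schemaName", "tableName"],
    "Column": ["systemAssetID", "projectID", "schemaName", "tableName", "columnName"],
}


def full_name(asset_type: str, values: dict) -> str:
    """Assemble the full name of an asset type from labelled segment values."""
    parts = [values[label] for label in BIGQUERY_CONVENTIONS[asset_type]]
    if asset_type == "Column":
        parts[-1] += "(column)"
    return ">".join(parts)


# Sample values copied from the Schema, Table, and Column example rows.
values = {
    "systemAssetID": "Big Query - ks",
    "projectID": "integrations-automated-uer",
    "schemaName": "kasia_bq_dataset",
    "tableName": "kasia_bq_table",
    "columnName": "string_field_1",
}
print(full_name("Table", values))
print(full_name("Column", values))
```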
Synchronized metadata per Google BigQuery asset type
This table shows the metadata for each Google Dataplex Catalog asset type.
Asset type | Synchronized metadata | Resource ID |
---|---|---|
GCP Project | Technology Asset groups / is grouped by Technology Asset | 00000000-0000-0000-0000-00000000705 |
Dataplex Lake | Lake Id | 00000000-0000-0000-0001-002600000004 |
Dataplex Lake | Location | 00000000-0000-0000-0000-000000000203 |
Dataplex Lake | Lake Status | 00000000-0000-0000-0001-002600000005 |
Dataplex Lake | Description from source system | 00000000-0000-0000-0001-000500000074 |
Dataplex Lake | URL | 00000000-0000-0000-0000-000000000258 |
Dataplex Lake | GCP Project groups / is grouped by Dataplex Lake | 00000000-0000-0000-0001-002700000000 |
Dataplex Lake | Any custom attribute that you added. You can use the custom attributes for custom label mapping when you synchronize Google Dataplex Catalog. | |
Dataplex Zone | Zone Name | 00000000-0000-0000-0001-002600000007 |
Dataplex Zone | Location | 00000000-0000-0000-0000-000000000203 |
Dataplex Zone | Any extensible properties defined via the capability. | |
Dataplex Zone | Description from source system | 00000000-0000-0000-0001-000500000074 |
Dataplex Zone | URL | 00000000-0000-0000-0000-000000000258 |
Dataplex Zone | Dataplex Lake contains / is part of Dataplex Zone | 00000000-0000-0000-0001-002700000001 |
Dataplex Zone | Any custom attribute that you added. You can use the custom attributes for custom label mapping when you synchronize Google Dataplex Catalog. | |
Schema | Data Source Type | 00000000-0000-0000-0001-000500000018 |
Schema | Description from source system | 00000000-0000-0000-0001-000500000074 |
Schema | Dataplex Zone contains / is part of Schema | |
Schema | Any custom attribute that you added. You can use the custom attributes for custom label mapping when you synchronize Google Dataplex Catalog. | |
Table | Any extensible properties defined via the capability. | |
Table | Schema contains / is part of Table | 00000000-0000-0000-0001-002700000002 |
Table | Description from source system | 00000000-0000-0000-0001-000500000074 |
Column | Technical Data Type | 00000000-0000-0000-0000-000000000219 |
Column | Column Position | 00000000-0000-0000-0001-000500000020 |
Column | Is Nullable | 00000000-0000-0000-0001-000500000011 |
Column | Column is part of / contains Table | 00000000-0000-0000-0000-000000007042 |
Column | Description from source system | 00000000-0000-0000-0001-000500000074 |