Google Dataplex ingestion via Edge
Google Dataplex Catalog is a technical catalog on Google Cloud that provides information about the data in your Dataplex projects. When you integrate Google Dataplex Catalog, the metadata of all data in those Dataplex projects is brought into Collibra Data Intelligence Platform.
The Google Dataplex ingestion is based on Dataplex and results in assets that represent the projects, lakes, zones, tables, and columns.
- Because only the metadata is integrated, you cannot get sample data for the columns and tables, or profile and classify them. To sample, profile, and classify the data, combine the integration of Google Dataplex Catalog with the registration of a BigQuery data source. For more information, go to Methods to work with Google Cloud Platform (GCP).
- The current Dataplex GCS discovery system has a limit of 1,000 tables per bucket.
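The hierarchy of projects, lakes, zones, tables, and columns described above, together with the 1,000-tables-per-bucket discovery limit, can be pictured with a small sketch. This is only an illustration in plain Python: the class names, `column_paths`, `buckets_over_limit`, and the sample names are hypothetical and are not part of any Collibra or Google API. How Dataplex treats tables beyond the limit is not stated here, so the helper only flags the affected buckets.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List

# Limit stated in this section: Dataplex GCS discovery handles
# at most 1,000 tables per bucket.
MAX_TABLES_PER_BUCKET = 1000


# Hypothetical classes that mirror the ingested hierarchy:
# project > lake > zone > table > column.
@dataclass
class Column:
    name: str


@dataclass
class Table:
    name: str
    columns: List[Column] = field(default_factory=list)


@dataclass
class Zone:
    name: str
    tables: List[Table] = field(default_factory=list)


@dataclass
class Lake:
    name: str
    zones: List[Zone] = field(default_factory=list)


@dataclass
class Project:
    name: str
    lakes: List[Lake] = field(default_factory=list)


def column_paths(project: Project) -> Iterator[str]:
    """Yield one 'project > lake > zone > table > column' path per column."""
    for lake in project.lakes:
        for zone in lake.zones:
            for table in zone.tables:
                for column in table.columns:
                    yield " > ".join(
                        [project.name, lake.name, zone.name,
                         table.name, column.name]
                    )


def buckets_over_limit(table_counts: Dict[str, int]) -> List[str]:
    """Return, sorted, the buckets whose table count exceeds the limit."""
    return sorted(
        bucket
        for bucket, count in table_counts.items()
        if count > MAX_TABLES_PER_BUCKET
    )


# Made-up sample data, for illustration only.
example = Project(
    "sales-project",
    [Lake("sales-lake",
          [Zone("raw-zone",
                [Table("orders", [Column("order_id"), Column("amount")])])])],
)
```

Calling `list(column_paths(example))` returns one path per column, and `buckets_over_limit({"bucket-a": 1000, "bucket-b": 1001})` flags only `bucket-b`, because a bucket with exactly 1,000 tables is still within the limit.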
The following images show the asset types in Collibra after you integrate Google Dataplex Catalog via the Google Dataplex ingestion. The asset types for Google Cloud Storage (GCS) assets may or may not include the GCS Bucket asset type. The asset types for Google Dataplex Catalog integrated with Google BigQuery include the Schema asset type.
- GCS assets without GCS Bucket
- GCS assets with GCS Bucket
- Google BigQuery
In Collibra 2024.05, we launched a new user interface (UI) for Collibra Data Intelligence Platform! You can learn more about this latest UI in the UI overview.
Use the following options to see the documentation in the latest UI or in the previous, classic UI:
You can integrate Google Dataplex Catalog only via Edge, not via Jobserver.
For information on the Google Dataplex Catalog, go to the Google documentation.
For information on the supported data types, go to the Google documentation on data types.
Note: When you add a bucket to Dataplex and Dataplex identifies schemas (tables and columns) for files in the bucket, Dataplex also adds these tables and columns to BigQuery automatically.