Assets, domain types, and operating model for the Google Dataplex ingestion

The Google Dataplex ingestion uses a specific subset of asset types, all of which are available out of the box in Collibra.

Asset type Description
GCP Project

An asset type representing a Google Cloud Platform Project.

Dataplex Zone

A zone represents a logical group of related assets within a Dataplex lake.

Dataplex Lake

A lake is a centralized repository for managing enterprise data across the organization.

Schema

A schema is the highest level of physical structure in a database. It defines the structure of the tables and columns in the database. This asset type is used exclusively for Google BigQuery data sources integrated from Google Dataplex Catalog, where it represents BigQuery datasets.

GCS Bucket

An asset type that represents a Google Cloud Storage bucket, which is a logical unit of storage containing Google Cloud Storage objects.

Table

An implementation of data entities in columns and rows, in a given database system. It is the basic structure of a relational database.
Examples: Account_tbl, CUST_ADDR

Column

An atomic unit of data that can be stored in a database table.
Examples: FST_NM, EMPID

Google Dataplex operating models

The integration of Google Dataplex Catalog has three operating models, which differ in the data source and in the asset types they include:

  • The operating model for Google Cloud Storage (GCS) assets that does not include the GCS Bucket asset type
  • The operating model for GCS assets that includes the GCS Bucket asset type
  • The operating model for Google BigQuery assets that includes the Schema asset type
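
The distinction between these operating models can be sketched as a small lookup structure. This is only an illustration, not part of Collibra or Google Dataplex: the class names, dictionary keys, and helper function are hypothetical, and the sketch records only what the list above states (which asset type each model explicitly includes or leaves out). It makes no claim about the other asset types in each model.

```python
from dataclasses import dataclass
from enum import Enum


class AssetType(Enum):
    """Asset types listed for the Google Dataplex ingestion."""
    GCP_PROJECT = "GCP Project"
    DATAPLEX_LAKE = "Dataplex Lake"
    DATAPLEX_ZONE = "Dataplex Zone"
    SCHEMA = "Schema"
    GCS_BUCKET = "GCS Bucket"
    TABLE = "Table"
    COLUMN = "Column"


@dataclass(frozen=True)
class OperatingModel:
    """Hypothetical record: one operating model and the asset type that sets it apart."""
    source: str                      # data source the model applies to
    distinguishing_type: AssetType   # asset type named in the model's description
    included: bool                   # True if the model includes that asset type


# Keys are illustrative labels, not identifiers used by Collibra.
OPERATING_MODELS = {
    "gcs_without_bucket": OperatingModel("Google Cloud Storage", AssetType.GCS_BUCKET, False),
    "gcs_with_bucket": OperatingModel("Google Cloud Storage", AssetType.GCS_BUCKET, True),
    "bigquery_with_schema": OperatingModel("Google BigQuery", AssetType.SCHEMA, True),
}


def models_explicitly_including(asset_type: AssetType) -> list[str]:
    """Return the labels of models whose description states they include asset_type."""
    return [
        label
        for label, model in OPERATING_MODELS.items()
        if model.distinguishing_type is asset_type and model.included
    ]


if __name__ == "__main__":
    for label, model in OPERATING_MODELS.items():
        verb = "includes" if model.included else "does not include"
        print(f"{label}: {model.source}, {verb} {model.distinguishing_type.value}")
```

Calling `models_explicitly_including(AssetType.GCS_BUCKET)` returns only the GCS model that uses the GCS Bucket asset type, which is a quick way to check which model a given asset hierarchy corresponds to.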