Integrated Google Cloud Storage data

After the synchronization, the resulting assets are in the domain that was specified in the crawler. The status of the assets depends on the value selected in the Default Asset Status field in the capability. If No Status is selected, newly created assets receive the first status listed in your Operating Model statuses, and existing assets keep their assigned status. If Implemented is selected, all assets receive the Implemented status.
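
The status rule described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not Collibra code: the function name `resolve_status`, its parameters, and the example status names other than Implemented are hypothetical. Treating any other selected status the same way as Implemented is also an assumption.

```python
from typing import Optional, Sequence


def resolve_status(
    default_asset_status: str,
    operating_model_statuses: Sequence[str],
    existing_status: Optional[str],
) -> str:
    """Return the status an asset would have after synchronization.

    existing_status is None for a newly created asset.
    """
    if default_asset_status == "No Status":
        if existing_status is None:
            # New asset: first status listed in the Operating Model statuses.
            return operating_model_statuses[0]
        # Existing asset: keeps its assigned status.
        return existing_status
    # A selected status such as Implemented applies to all assets
    # (assumption: other selectable statuses behave the same way).
    return default_asset_status


# Illustrative status list; the real order comes from your Operating Model.
statuses = ["Candidate", "Accepted", "Implemented"]
new_asset_status = resolve_status("No Status", statuses, None)
kept_status = resolve_status("No Status", statuses, "Accepted")
```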

Warning Do not move the assets to another domain. Doing so may lead to errors during future synchronizations.

Tip GCS synchronization relies on UUIDs.

Note If a temporary communication issue results in a partial synchronization, the status of the assets that were not synchronized becomes Missing from source. If the assets are identified in the source system during the next fully successful synchronization, the previous statuses are restored.
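
As a rough model of the behavior in this note, the following Python sketch tracks the previous status of an asset so that it can be restored later. The data layout (a dictionary with `status` and `previous_status` keys), the asset IDs, and the function name `apply_sync` are assumptions made for illustration. Cases that the note does not describe are deliberately left unchanged.

```python
MISSING = "Missing from source"


def apply_sync(assets, synchronized_ids, fully_successful):
    """Update asset statuses in place after one synchronization run.

    assets: dict mapping an asset ID to {"status": ..., "previous_status": ...}
    synchronized_ids: IDs of the assets found in the source during this run
    fully_successful: False when the run was only partial
    """
    for asset_id, asset in assets.items():
        found = asset_id in synchronized_ids
        if not found and not fully_successful:
            # Partial synchronization: remember the status, then mark as missing.
            if asset["status"] != MISSING:
                asset["previous_status"] = asset["status"]
                asset["status"] = MISSING
        elif found and fully_successful and asset["status"] == MISSING:
            # Identified again in a fully successful run: restore the status.
            asset["status"] = asset["previous_status"]
            asset["previous_status"] = None
        # Other combinations are not described in the note and stay unchanged.


# Hypothetical assets keyed by made-up IDs.
assets = {
    "asset-a": {"status": "Accepted", "previous_status": None},
    "asset-b": {"status": "Candidate", "previous_status": None},
}
apply_sync(assets, {"asset-a"}, fully_successful=False)            # asset-b becomes missing
apply_sync(assets, {"asset-a", "asset-b"}, fully_successful=True)  # asset-b is restored
```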

By default, the assets are shown in a plain list, but you can enable a multi-path hierarchy to show them in a tree structure. The resulting assets depend on whether you use Google Dataplex.

Synchronization results without Google Dataplex

For the best result, we recommend that you use the following relations:

  1. File Storage contains Storage Container
  2. Storage Container contains Storage Container
  3. Storage Container contains File
  4. Directory contains Directory

The following image shows the resulting hierarchical table.
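
To illustrate how "contains" relations turn a plain list of assets into a tree, here is a minimal Python sketch. It does not reproduce Collibra's rendering. The asset names and the helper `render_tree` are invented for the example; only the relation types (File Storage contains Storage Container, Storage Container contains Storage Container, Storage Container contains File) come from the list above.

```python
from collections import defaultdict

# Hypothetical (parent, child) pairs, each an instance of a "contains" relation.
EDGES = [
    ("gcs-file-system", "sales-bucket"),  # File Storage contains Storage Container
    ("sales-bucket", "2024"),             # Storage Container contains Storage Container
    ("2024", "orders.csv"),               # Storage Container contains File
    ("sales-bucket", "readme.txt"),       # Storage Container contains File
]


def render_tree(edges, root):
    """Return the hierarchy below root as a list of indented lines."""
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    lines = []

    def walk(node, depth):
        lines.append("  " * depth + node)
        for child in sorted(children.get(node, [])):
            walk(child, depth + 1)

    walk(root, 0)
    return lines


print("\n".join(render_tree(EDGES, "gcs-file-system")))
```

Without the relations, the same five assets would appear as a flat list. With them, each asset is nested under the asset that contains it.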

Synchronization results with Google Dataplex

For the best result, we recommend that you use the following relations:

  1. File Storage contains Storage Container
  2. Storage Container contains Storage Container
  3. Storage Container contains File
  4. Directory contains Directory
  5. Directory contains File Group
  6. File Group contains Table
  7. Table contains Column


The synchronization creates a Directory asset named /. This is needed to ensure the asset can contain the File Group assets.
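
The seven relations recommended for Google Dataplex can also be written down as data and used to check whether a chain of asset types forms a valid path in the hierarchy. The sketch below is an illustration only. The names `DATAPLEX_RELATIONS` and `is_valid_path` are hypothetical, and the check looks only at asset type names, not at how Collibra resolves asset type inheritance.

```python
# (head asset type, tail asset type) for each recommended "contains" relation.
DATAPLEX_RELATIONS = {
    ("File Storage", "Storage Container"),
    ("Storage Container", "Storage Container"),
    ("Storage Container", "File"),
    ("Directory", "Directory"),
    ("Directory", "File Group"),
    ("File Group", "Table"),
    ("Table", "Column"),
}


def is_valid_path(asset_types):
    """Return True if every consecutive pair of types is a recommended relation."""
    pairs = zip(asset_types, asset_types[1:])
    return all(pair in DATAPLEX_RELATIONS for pair in pairs)


# The Directory asset named / sits above the File Group assets.
root_directory_path = ["Directory", "File Group", "Table", "Column"]
path_ok = is_valid_path(root_directory_path)
```

A path such as File Storage directly followed by Table fails the check, because no recommended relation links those two types.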

Synchronized metadata per asset type

This table shows the metadata for each GCS asset type.

| Asset type | Synchronized metadata | Public ID |
| --- | --- | --- |
| GCS Bucket | File Storage contains / is part of Storage Container | FileStorageContainsFileContainer |
| | Location | Location |
| Directory | URL | Url |
| | Storage Container contains / is part of Storage Container | FileContainerContainsFileContainer |
| | Directory contains / is part of Directory | DirectoryContainsDirectory |
| File Group | File Type | FileType |
| | Directory contains / is part of File Group | DirectoryContainsFileGroup |
| File | URL | Url |
| | Storage Container contains / contained in File | FileContainerContainsFile |
| Table | Description | Description |
| | File Group contains / is part of Table | FileGroupContainsTable |
| Column | Description | Description |
| | Technical Data Type | TechnicalDataType |
| | Column Position | ColumnPosition |
| | Is Nullable | IsNullable |
| | Column is part of / contains Table | ColumnIsPartOfTable |
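
If you need the public IDs in a script, a simple lookup structure can help. The following Python sketch transcribes the public IDs from the table, grouped by asset type. The grouping reflects one reading of the table layout, and the names `PUBLIC_IDS` and `asset_types_using` are hypothetical. No Collibra API is called.

```python
# Public IDs per asset type, transcribed from the table above.
PUBLIC_IDS = {
    "GCS Bucket": ["FileStorageContainsFileContainer", "Location"],
    "Directory": [
        "Url",
        "FileContainerContainsFileContainer",
        "DirectoryContainsDirectory",
    ],
    "File Group": ["FileType", "DirectoryContainsFileGroup"],
    "File": ["Url", "FileContainerContainsFile"],
    "Table": ["Description", "FileGroupContainsTable"],
    "Column": [
        "Description",
        "TechnicalDataType",
        "ColumnPosition",
        "IsNullable",
        "ColumnIsPartOfTable",
    ],
}


def asset_types_using(public_id):
    """Return the asset types whose synchronized metadata includes public_id."""
    return sorted(
        asset_type
        for asset_type, ids in PUBLIC_IDS.items()
        if public_id in ids
    )


url_types = asset_types_using("Url")
```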