Integrated Amazon S3 data

After you have synchronized the data, the integration of the Amazon S3 file system is complete.

Synchronization results

After synchronization, the resulting assets are in the domain that was specified in the crawler.

Warning: Do not move the assets to another domain. Doing so may lead to errors during future synchronizations. This is a known limitation.

By default, the assets are shown in a plain list, but you can enable a multi-path hierarchy to show them in a tree structure. For the best result, we recommend that you use the following relations:

  1. File Container contains File Container
  2. Directory contains Directory
  3. File Container contains File
  4. Directory contains File Group
  5. File contains Table
  6. File Group contains Table
  7. Table contains Column
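
The relations above can be read as parent-to-child edges. The following is a minimal Python sketch of how such edges turn a flat list of assets into a tree. It does not call Collibra. The asset names, the `assets` and `edges` structures, and the helper functions are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Recommended relations, written as (parent asset type, child asset type).
RECOMMENDED_RELATIONS = {
    ("File Container", "File Container"),
    ("Directory", "Directory"),
    ("File Container", "File"),
    ("Directory", "File Group"),
    ("File", "Table"),
    ("File Group", "Table"),
    ("Table", "Column"),
}


def build_children(assets, edges):
    """Group child names under each parent, keeping only recommended relations."""
    children = defaultdict(list)
    for parent, child in edges:
        if (assets[parent], assets[child]) in RECOMMENDED_RELATIONS:
            children[parent].append(child)
    return children


def render_tree(root, assets, children, depth=0):
    """Return the tree below `root` as a list of indented text lines."""
    lines = [f"{'  ' * depth}{root} ({assets[root]})"]
    for child in children.get(root, []):
        lines.extend(render_tree(child, assets, children, depth + 1))
    return lines


# Hypothetical sample data: asset name -> asset type.
assets = {
    "sales": "Directory",
    "sales/2024": "Directory",
    "sales/2024/orders": "File Group",
    "orders": "Table",
    "order_id": "Column",
}
# Hypothetical (parent, child) pairs between the sample assets.
edges = [
    ("sales", "sales/2024"),
    ("sales/2024", "sales/2024/orders"),
    ("sales/2024/orders", "orders"),
    ("orders", "order_id"),
]

children = build_children(assets, edges)
tree_lines = render_tree("sales", assets, children)
print("\n".join(tree_lines))
```

Each level of indentation in the printed result corresponds to one of the recommended relations, which is the same nesting that the multi-path hierarchy displays.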

The following image shows the resulting hierarchical table.

Note: In case of a partial synchronization caused by a temporary communication issue, the status of the assets that cannot be synchronized is set to Missing from source. During the next fully successful synchronization, the assets are removed or their previous status is restored, depending on their actual status in the source system.
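
The status handling described in the note can be sketched as follows. This is an illustrative Python model only, not Collibra's implementation. The dictionary layout, the function names, and the sample statuses "Candidate" and "Accepted" are assumptions made for the example.

```python
MISSING = "Missing from source"


def flag_unsynced(assets, unsynced_ids):
    """Partial synchronization: flag assets that could not be synchronized."""
    flagged = {}
    for asset_id, asset in assets.items():
        if asset_id in unsynced_ids and asset["status"] != MISSING:
            # Remember the current status so that it can be restored later.
            flagged[asset_id] = {
                "status": MISSING,
                "previous_status": asset["status"],
            }
        else:
            flagged[asset_id] = dict(asset)
    return flagged


def resolve_after_full_sync(assets, source_ids):
    """Fully successful synchronization: resolve every flagged asset."""
    resolved = {}
    for asset_id, asset in assets.items():
        if asset["status"] != MISSING:
            # Assets that were never flagged are left as they are.
            resolved[asset_id] = dict(asset)
        elif asset_id in source_ids:
            # Still present in the source: restore the previous status.
            resolved[asset_id] = {
                "status": asset["previous_status"],
                "previous_status": None,
            }
        # Flagged and no longer in the source: the asset is removed,
        # so it is not copied into the result.
    return resolved


# Hypothetical assets before a partial synchronization.
assets = {
    "orders.csv": {"status": "Candidate", "previous_status": None},
    "old.csv": {"status": "Candidate", "previous_status": None},
    "users.csv": {"status": "Accepted", "previous_status": None},
}

after_partial = flag_unsynced(assets, {"orders.csv", "old.csv"})
after_full = resolve_after_full_sync(after_partial, {"orders.csv", "users.csv"})
print(sorted(after_full))
```

In this model, "orders.csv" returns to its earlier status because it still exists in the source, while "old.csv" is dropped because it does not.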

Synchronized metadata per asset type

This table shows the metadata for each Amazon S3 asset type.

Asset type  Synchronized metadata                                Resource ID
----------  ---------------------------------------------------  ------------------------------------
S3 Bucket   URL                                                  00000000-0000-0000-0000-000000000258
            Location                                             00000000-0000-0000-0000-000000000203
            File Storage contains / is part of File Container    00000000-0000-0000-0001-002600000000
Directory   URL                                                  00000000-0000-0000-0000-000000000258
            File Container contains / is part of File Container  00000000-0000-0000-0001-002600000001
            Directory contains / is part of Directory            00000000-0000-0000-0001-002600000003
File Group  URL                                                  00000000-0000-0000-0000-000000000258
            File Type                                            00000000-0000-0000-0001-002500000012
            Document Size                                        00000000-0000-0000-0000-000000000259
            Number of Files                                      00000000-0000-0000-0001-002500000001
            Directory contains / is part of File Group           00000000-0000-0000-0001-002600000004
File        URL                                                  00000000-0000-0000-0000-000000000258
            File Type                                            00000000-0000-0000-0001-002500000012
            Document Size                                        00000000-0000-0000-0000-000000000259
            File Container contains / contained in File          00000000-0000-0000-0000-000000007060
Table       Glue database name                                   00000000-0000-0000-0001-000500000066
            Glue table name                                      00000000-0000-0000-0001-000500000067
            File contains / is part of Table                     00000000-0000-0000-0001-002600000002
            File Group contains / is part of Table               00000000-0000-0000-0001-002600000005
Column      Technical Data Type                                  00000000-0000-0000-0000-000000000219
            Column Position                                      00000000-0000-0000-0001-000500000020
            Column is part of / contains Table                   00000000-0000-0000-0000-000000007042
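
If you refer to these resource IDs from your own scripts, it can help to keep them in one lookup structure instead of scattering literal IDs. The following Python sketch shows one possible layout. The IDs are copied from the table above. The constant and function names are hypothetical, and the sketch covers only the non-relation metadata rows.

```python
import uuid

# Resource IDs of the synchronized metadata, copied from the table above.
METADATA_RESOURCE_IDS = {
    "URL": "00000000-0000-0000-0000-000000000258",
    "Location": "00000000-0000-0000-0000-000000000203",
    "File Type": "00000000-0000-0000-0001-002500000012",
    "Document Size": "00000000-0000-0000-0000-000000000259",
    "Number of Files": "00000000-0000-0000-0001-002500000001",
    "Glue database name": "00000000-0000-0000-0001-000500000066",
    "Glue table name": "00000000-0000-0000-0001-000500000067",
    "Technical Data Type": "00000000-0000-0000-0000-000000000219",
    "Column Position": "00000000-0000-0000-0001-000500000020",
}

# Non-relation metadata that is synchronized for each asset type.
SYNCED_METADATA = {
    "S3 Bucket": ["URL", "Location"],
    "Directory": ["URL"],
    "File Group": ["URL", "File Type", "Document Size", "Number of Files"],
    "File": ["URL", "File Type", "Document Size"],
    "Table": ["Glue database name", "Glue table name"],
    "Column": ["Technical Data Type", "Column Position"],
}


def metadata_ids_for(asset_type):
    """Return a {metadata name: resource ID} mapping for one asset type."""
    return {name: METADATA_RESOURCE_IDS[name] for name in SYNCED_METADATA[asset_type]}


# Sanity check: every ID must parse as a UUID and round-trip unchanged.
for resource_id in METADATA_RESOURCE_IDS.values():
    assert str(uuid.UUID(resource_id)) == resource_id

print(metadata_ids_for("File"))
```

Keeping the names and IDs together makes a mistyped ID easier to spot, because each literal appears only once.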