Amazon S3 file and asset types

The Amazon S3 file system integration uses a specific subset of asset types, all of which are available out of the box. For information on the asset attributes and metadata synchronized per asset type, go to Integrated Amazon S3 data.

Supported file types

Amazon S3 can contain objects of many different file types. However, not all file types are fully supported because of limitations in AWS Glue.

The following list shows the file types that Collibra supports. Other file types may also work. For an exhaustive list of supported file types, go to the AWS Glue documentation.

  • AVRO
  • ORC
  • PARQUET
  • JSON
  • BSON
  • XML
  • ION
  • COMBINED_APACHE
  • APACHE
  • LINUX_KERNEL
  • RUBY_LOGGER
  • SQUID
  • REDISMONLOG
  • REDISLOG
  • CSV
  • ZIP
  • TAR
  • RAR
  • GZ
  • JAR
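
As a rough illustration of how the list above might be used, the following Python sketch checks a file type label against it. The names SUPPORTED_FILE_TYPES, is_listed_file_type and split_by_support are hypothetical and not part of Collibra or AWS Glue; the input is assumed to be a file type label such as the ones in the list. Because other file types may also work, a label that is not listed is not necessarily unsupported.

```python
# File type labels transcribed from the list in this section.
SUPPORTED_FILE_TYPES = frozenset({
    "AVRO", "ORC", "PARQUET", "JSON", "BSON", "XML", "ION",
    "COMBINED_APACHE", "APACHE", "LINUX_KERNEL", "RUBY_LOGGER",
    "SQUID", "REDISMONLOG", "REDISLOG", "CSV", "ZIP", "TAR",
    "RAR", "GZ", "JAR",
})


def is_listed_file_type(label: str) -> bool:
    """Return True if the label appears in the list (case-insensitive)."""
    return label.strip().upper() in SUPPORTED_FILE_TYPES


def split_by_support(labels):
    """Split labels into (listed, unlisted), preserving input order."""
    listed, unlisted = [], []
    for label in labels:
        (listed if is_listed_file_type(label) else unlisted).append(label)
    return listed, unlisted


if __name__ == "__main__":
    listed, unlisted = split_by_support(["parquet", "csv", "docx"])
    print("Listed:", listed)
    print("Not listed (may still work):", unlisted)
```

Keeping the labels in a frozenset makes the lookup constant-time and prevents accidental modification of the list.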

Supported asset types

Each entry lists the asset type, its parent asset types, and a description.

Asset type: Column
Parent asset types: Data Asset > Data Element
Description: An atomic unit of data that can be stored in a database table.
Examples: FST_NM, EMPID

Asset type: Table
Parent asset types: Data Asset > Data Structure
Description: An implementation of data entities in columns and rows, in a given database system. It is the basic structure of a relational database.
Examples: Account_tbl, CUST_ADDR

Asset type: Database View
Parent asset types: Data Asset > Data Structure > Table
Description: A virtual table based on the result-set of an SQL statement.

Asset type: Storage Container
Parent asset types: Technology Asset
Description: An asset type that represents a Storage Container.

Asset type: Directory
Parent asset types: Technology Asset > Storage Container
Description: A collection of data that is treated by a computer as a unit, for the purposes of input and output.

Asset type: S3 Bucket
Parent asset types: Technology Asset > Storage Container
Description: An asset type that represents an Amazon S3 Bucket, which is a logical unit of storage containing Amazon S3 Objects.

Asset type: File Storage
Parent asset types: Technology Asset > System
Description: An asset type that represents a Cloud File Storage bucket.

Asset type: S3 File System
Parent asset types: Technology Asset > System > File Storage
Description: Amazon S3 (Simple Storage Service) file system abstraction.
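
The parent asset types listed for each entry above form a small hierarchy. The following Python sketch models it as a child-to-parent mapping and rebuilds the full path for a given asset type. It only illustrates the structure and is not how Collibra stores or exposes asset types; PARENT and type_path are hypothetical names.

```python
# Parent of each asset type, transcribed from the entries in this section.
# Top-level types (Data Asset, Technology Asset) have no entry.
PARENT = {
    "Data Element": "Data Asset",
    "Column": "Data Element",
    "Data Structure": "Data Asset",
    "Table": "Data Structure",
    "Database View": "Table",
    "Storage Container": "Technology Asset",
    "Directory": "Storage Container",
    "S3 Bucket": "Storage Container",
    "System": "Technology Asset",
    "File Storage": "System",
    "S3 File System": "File Storage",
}


def type_path(asset_type: str) -> list:
    """Return the path from the top-level type down to asset_type."""
    path = [asset_type]
    # Walk upward until a type with no recorded parent is reached.
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return list(reversed(path))


if __name__ == "__main__":
    for name in ("Column", "Database View", "S3 Bucket", "S3 File System"):
        print(" > ".join(type_path(name)))
```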

Amazon S3 diagram view

The following image shows the relations between Amazon S3 asset types and the cardinality of the relation types in the asset type assignment.