Register an Amazon S3 file system via the AWS Glue JDBC connector and Edge

Registering an Amazon S3 file system via the AWS Glue JDBC connector lets you add the file system as a data source and synchronize its metadata, so that the S3 tables and columns are represented in Collibra.

Follow the steps below to register an Amazon S3 file system via Edge.

 

Step

What?

Description

Results

Preparation

0

Make sure the following settings are enabled:

Ensures that the required settings are enabled.

Your environment is ready for Edge.

1

Prepare your Edge site

Ensures you have an Edge site with an AWS Glue connection for Amazon S3 and the required capabilities.

 
Setup

2

Register the data source

Registering a data source creates the structure for the metadata in Collibra.

  • A Physical Data Dictionary domain containing a Database asset is created.
  • A list of available schemas is created on the Configuration tab page of the Database asset.

3

Configure the synchronization of your data source

Lets you select the schemas and tables that you want to ingest.

The information on the Configuration tab page of the Database asset is filled in.

Registration

4

Synchronizes the schema of a registered data source to make the metadata available in Collibra.

Schema, Table, Column, and Foreign Key assets are created in the specified domain, and registration data becomes available.

5

If needed, profile and classify the synchronized data.

Data profiling creates a summary of a data source that is registered with Data Catalog and determines the data type of the columns in the data source. The summary mainly contains statistics and graphics that give the user an idea of what the registered data contains.

Classification analyzes and predicts the content of registered data sources based on a subset of the data itself, helping you understand what kinds of data you have and where they reside.

The Table and Column assets contain profiling information, and the Column assets are classified.
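
To make the ideas behind profiling and classification in step 5 more concrete, the following is a minimal Python sketch. It is not Collibra's implementation and does not call any Collibra or AWS API. The function names (profile_column, classify_column), the sample data, the sample size, the 80% threshold, and the single "email address" class are all hypothetical and chosen only for illustration. It shows how a profile can summarize a column and infer its data type, and how a classification can be predicted from a subset of the values.

```python
import re
from collections import Counter

def infer_type(values):
    """Guess a column data type from its non-empty string values."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "unknown"
    if all(re.fullmatch(r"-?\d+", v) for v in non_empty):
        return "integer"
    if all(re.fullmatch(r"-?\d+(\.\d+)?", v) for v in non_empty):
        return "decimal"
    return "text"

def profile_column(values):
    """Build a small statistical summary of one column (illustrative only)."""
    non_empty = [v for v in values if v != ""]
    return {
        "data_type": infer_type(values),
        "row_count": len(values),
        "empty_count": len(values) - len(non_empty),
        "distinct_count": len(set(non_empty)),
        "most_common": Counter(non_empty).most_common(1)[0][0] if non_empty else None,
    }

# Hypothetical pattern for a single data class.
EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

def classify_column(values, sample_size=100, threshold=0.8):
    """Predict a data class from a subset of the column values (assumed rule)."""
    sample = [v for v in values[:sample_size] if v != ""]
    if not sample:
        return None
    hits = sum(1 for v in sample if EMAIL.fullmatch(v))
    return "email address" if hits / len(sample) >= threshold else None

# Hypothetical sample of two synchronized columns.
columns = {
    "customer_id": ["1", "2", "3", "4"],
    "email": ["a@example.com", "b@example.com", "", "c@example.com"],
}

profiles = {name: profile_column(vals) for name, vals in columns.items()}
classes = {name: classify_column(vals) for name, vals in columns.items()}

for name in columns:
    print(name, profiles[name], classes[name])
```

In this sketch the profile plays the role of the summary attached to a Column asset, and the predicted class plays the role of the classification result. A real classifier would typically cover many data classes and use more robust detection than a single regular expression.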