Register an Amazon S3 file system via Edge

You can configure Collibra to register and synchronize an Amazon S3 file system via Edge.

Amazon S3 is an online object storage service hosted by Amazon. For more information about Amazon S3, see the Amazon S3 documentation.
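
Amazon S3 stores objects under flat keys rather than in real directories. Folder-like hierarchies come from key prefixes and a delimiter, conventionally "/". The following minimal Python sketch shows how such keys can be read as a file-system tree. The function name build_tree and the sample keys are hypothetical and chosen for illustration only; this is not a description of how Collibra or its crawlers process Amazon S3.

```python
from typing import Dict, Iterable

# A directory tree as nested dicts: each name maps to its children.
Tree = Dict[str, "Tree"]


def build_tree(keys: Iterable[str], delimiter: str = "/") -> Tree:
    """Group flat object keys into a nested dict, one level per delimiter-separated part."""
    root: Tree = {}
    for key in keys:
        node = root
        for part in key.split(delimiter):
            if part:  # skip empty parts from leading, trailing, or doubled delimiters
                node = node.setdefault(part, {})
    return root


if __name__ == "__main__":
    # Hypothetical object keys, invented for illustration.
    sample_keys = [
        "sales/2023/q1.csv",
        "sales/2023/q2.csv",
        "hr/employees.parquet",
    ]
    print(build_tree(sample_keys))
```

In this sketch, objects appear as leaves with empty children, and each shared prefix becomes one folder-like level.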

Workflow

The following table shows the steps required to register an Amazon S3 file system via Edge.

# Step Description

1 Enable the Amazon S3 file system registration and synchronization via Edge. Enable S3 integration via Edge.

2 Create an Edge site. Create an Edge site to have a processing runtime on your premises. Usually, you install one Edge site per virtual or physical network, security domain, virtual private cloud (VPC), or public cloud account.

3 Install the Edge site. Install an Edge site to make the Edge software run on a server, close to your data source. Note: Make sure that your environment meets the system requirements before you install the Edge site.

4 Add an AWS connection. Add connection details to an Edge site to create a connection to Amazon S3.

5 Add an S3 synchronization capability. Add an S3 synchronization capability to an Edge site to retrieve data from Amazon S3.
6 Register an Amazon S3 file system. Create the initial structure of a Storage Catalog domain and S3 File System asset in the selected parent community.
7 Connect a file system asset to Amazon S3. Link the registered S3 file system to an Edge capability to connect to Amazon S3.
8 Create crawlers. Create crawlers to find and ingest the Amazon S3 metadata.
9 Synchronize Amazon S3. Run the crawlers to ingest the Amazon S3 metadata. You can synchronize Amazon S3 manually or add a synchronization schedule to synchronize it automatically.
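
The workflow above can be sketched as an ordered checklist. The following Python example is only an illustration for tracking progress, for instance in a rollout script. The names WORKFLOW_STEPS, next_step, and can_start are hypothetical, and the rule that each step requires all earlier steps to be complete is an assumption based on the order of the table, not a documented Collibra constraint.

```python
from typing import Optional, Set

# Step names copied from the workflow table; the tuple order follows the table.
WORKFLOW_STEPS = (
    "Enable the Amazon S3 file system registration and synchronization via Edge",
    "Create an Edge site",
    "Install the Edge site",
    "Add an AWS connection",
    "Add an S3 synchronization capability",
    "Register an Amazon S3 file system",
    "Connect a file system asset to Amazon S3",
    "Create crawlers",
    "Synchronize Amazon S3",
)


def next_step(completed: Set[int]) -> Optional[int]:
    """Return the 1-based number of the first step not yet completed, or None if all are done."""
    for number in range(1, len(WORKFLOW_STEPS) + 1):
        if number not in completed:
            return number
    return None


def can_start(step: int, completed: Set[int]) -> bool:
    """Assumption: a step can start only when every earlier step is complete."""
    if not 1 <= step <= len(WORKFLOW_STEPS):
        raise ValueError(f"unknown step: {step}")
    return all(earlier in completed for earlier in range(1, step))


if __name__ == "__main__":
    done = {1, 2, 3}  # Edge is enabled, and the Edge site is created and installed
    pending = next_step(done)
    print(f"Next: step {pending}: {WORKFLOW_STEPS[pending - 1]}")
    print("Can create crawlers now?", can_start(8, done))
```

With steps 1 to 3 complete, the sketch reports step 4, adding an AWS connection, as the next action, and reports that creating crawlers cannot start yet.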