Register an Amazon S3 file system via the AWS Glue JDBC connector and Edge
If you register an Amazon S3 file system via the AWS (Amazon Web Services) Glue JDBC connector, the resulting assets represent the tables and columns in Amazon S3 without the folder context. You can profile and classify the data, but the folder structure of your Amazon S3 environment isn't represented in Data Catalog. Note that the AWS Glue JDBC connector uses the Athena JDBC driver.
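
To make the missing folder context concrete, the following sketch contrasts the folder hierarchy of an S3 location with the relational names that a JDBC-style view carries. It is an illustration only: the bucket, database, table, and column names are hypothetical, and `GlueTable`, `s3_folders`, and `asset_paths` are helpers invented for this example, not part of Collibra, AWS Glue, or Athena.

```python
from __future__ import annotations

from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class GlueTable:
    """Hypothetical, simplified view of a catalog table backed by S3 files."""
    database: str
    name: str
    columns: tuple[str, ...]
    location: str  # S3 prefix that holds the table's files


def s3_folders(location: str) -> list[str]:
    """Split an s3:// URI into its bucket and folder segments."""
    parsed = urlparse(location)
    return [parsed.netloc] + [part for part in parsed.path.split("/") if part]


def asset_paths(table: GlueTable) -> list[str]:
    """Names visible through a relational view: schema > table > column only."""
    return [f"{table.database} > {table.name} > {col}" for col in table.columns]


# All names below are made up for illustration.
orders = GlueTable(
    database="sales_db",
    name="orders",
    columns=("order_id", "amount"),
    location="s3://example-bucket/raw/2024/orders/",
)

print(s3_folders(orders.location))  # folder hierarchy that exists in S3
print(asset_paths(orders))          # names the relational view carries
```

In this toy model, the `raw/2024/` folders appear only in the location, never in the schema, table, or column names, which is the sense in which the folder structure is not represented.
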
Follow the steps below to register an Amazon S3 file system via Edge.
| | Step | What? | Description | Results |
|---|---|---|---|---|
| Preparation | 0 | Make sure the following settings are enabled: | Makes sure the required settings are enabled. | Your environment is ready for Edge. |
| | 1 | Prepare your Edge site | Ensures you have an Edge site with an AWS Glue connection for Amazon S3 and the required capabilities. | |
| Setup | 2 | Register the data source | Registering a data source creates the structure for the metadata in Collibra. | |
| | 3 | | Making a selection of schemas and tables that you want to ingest. | The information on the Configuration tab page of the Database asset is filled in. |
| Registration | 4 | | Synchronizing the schema of a registered data source to make the metadata available in Collibra. | Schema, Table, Column and Foreign Key assets are created in the specified domain, and registration data becomes available. |
| | 5 | If needed, profile and classify the synchronized data. | Data Profiling creates a summary of a data source in Data Catalog and determines the data type of columns in the data source. The summary mainly contains statistics and graphics to give the user an idea of what the data is about. Data Profiling is available for registered JDBC data sources. Classification analyzes and predicts the content of registered data sources based on a subset of the data itself, helping you gain insight into what kinds of data you have and where it resides. | The Table and Column assets contain profiling information and the Columns are classified. |
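
As a rough illustration of the kind of summary that profiling produces in step 5, the sketch below computes a few statistics for one column and guesses its data type from sample values. It is a toy under stated assumptions, not Collibra's profiling algorithm: `profile_column`, `infer_type`, and the statistics chosen here are hypothetical.

```python
from collections import Counter


def infer_type(values):
    """Guess a column type from non-null string samples (toy heuristic)."""
    def all_parse(cast):
        # True only if every sample converts without raising ValueError.
        try:
            for value in values:
                cast(value)
            return True
        except ValueError:
            return False

    if not values:
        return "unknown"
    if all_parse(int):
        return "integer"
    if all_parse(float):
        return "decimal"
    return "text"


def profile_column(samples):
    """Summarize a column: row count, nulls, distinct values, inferred type."""
    non_null = [s for s in samples if s not in (None, "")]
    counts = Counter(non_null)
    return {
        "rows": len(samples),
        "nulls": len(samples) - len(non_null),
        "distinct": len(counts),
        "most_common": counts.most_common(1)[0][0] if counts else None,
        "type": infer_type(non_null),
    }


# Hypothetical sample values for a single column.
print(profile_column(["10", "25", "", "10"]))
```

A real profiler works on a sample of the source data and reports far richer statistics; the point here is only that the profile is derived from the data itself rather than from the declared schema.
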