Synchronize Amazon S3

When you synchronize Amazon S3, the content of your Amazon S3 repository is analyzed and represented as assets and their characteristics. You can synchronize manually or automate the process by adding a synchronization schedule.

Technically, the synchronization happens in several steps:

Warning: Do not move the assets to another domain. Doing so may lead to errors during future synchronizations. This is a known limitation.

Naming convention

Synchronizing Amazon S3 relies on a naming convention to match assets during the synchronization process. We highly recommend that you do not change the full name of the S3 File System asset.

Warning: Editing the full name of an S3 File System asset may lead to errors during the synchronization process.

Prerequisites

In your Collibra environment

In your AWS environment

Steps

To synchronize manually:

  1. Open an S3 File System asset page.
  2. In the tab bar, click Configuration.
  3. In the Crawlers section, click Synchronize.

    A notification indicates synchronization has started.

    The synchronization job appears in the Activities list as a bulk synchronization.

    The Synchronization Schedule section displays the time of the last synchronization.

Note: If a synchronization completes only partially because of a temporary communication issue, the status of the assets that cannot be synchronized is set to Missing from source. During the next fully successful synchronization, these assets are either removed or restored to their previous status, depending on their actual status in the source system.
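
The behavior described in this note can be pictured with a short sketch. This is only an illustration, not Collibra's implementation: the Asset class, the mark_unreachable and reconcile functions, the asset names, and the example statuses are all hypothetical. It only shows the idea of flagging assets after a partial synchronization and resolving them after the next fully successful one.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

MISSING = "Missing from source"


@dataclass
class Asset:
    # Hypothetical, simplified asset model used only for this illustration.
    name: str
    status: str
    previous_status: Optional[str] = None  # remembered so it can be restored


def mark_unreachable(assets: Dict[str, Asset], unreachable: Set[str]) -> None:
    """Partial synchronization: flag the assets that could not be synchronized."""
    for name in unreachable:
        asset = assets[name]
        if asset.status != MISSING:
            asset.previous_status = asset.status
            asset.status = MISSING


def reconcile(assets: Dict[str, Asset], present_in_source: Set[str]) -> None:
    """Fully successful synchronization: resolve every flagged asset."""
    for name in list(assets):
        asset = assets[name]
        if asset.status != MISSING:
            continue  # assets that were never flagged are out of scope here
        if name in present_in_source:
            # Still exists in the source system: restore the earlier status.
            asset.status = asset.previous_status or asset.status
            asset.previous_status = None
        else:
            # No longer exists in the source system: remove the asset.
            del assets[name]


# Illustrative data; the statuses are placeholders.
assets = {
    "bucket/a.csv": Asset("bucket/a.csv", "Candidate"),
    "bucket/b.csv": Asset("bucket/b.csv", "Accepted"),
}
mark_unreachable(assets, {"bucket/a.csv", "bucket/b.csv"})
reconcile(assets, present_in_source={"bucket/a.csv"})
print(sorted(assets))  # only bucket/a.csv remains, with its status restored
```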

To add a synchronization schedule:

  1. Open an S3 File System asset page.
  2. In the tab bar, click Configuration.
  3. In the Synchronization Schedule section, click Add Schedule.
  4. Enter the required information.
    Field         Description
    Repeat        The interval at which you want to synchronize automatically. The possible values are Daily, Weekly, Monthly, and Cron expression.
    Cron          The Quartz Cron expression that determines when the synchronization takes place. This field is only visible if you select Cron expression in the Repeat field.
    Every         The day of the week on which you want to synchronize, for example, Sunday. This field is only visible if you select Weekly in the Repeat field.
    Every first   The day of the week on which you want to synchronize each month, for example, Tuesday for the first Tuesday of the month. This field is only visible if you select Monthly in the Repeat field.
    At            The time at which you want to synchronize automatically, for example, 14:00. You can only schedule on the hour: you can add a synchronization schedule at 8:00, but not at 8:45. This field is only visible if you select Daily, Weekly, or Monthly in the Repeat field.
    Time zone     The time zone for the schedule.
  5. Click Save.
You can also edit or remove an S3 synchronization schedule.
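
If you select Cron expression in the Repeat field, it can help to see how the other Repeat options might look in Quartz syntax. The sketch below is an illustration under assumptions, not a description of how Collibra stores schedules: the quartz_cron function is hypothetical, it uses the six-field Quartz layout (seconds, minutes, hours, day of month, month, day of week), and it assumes that Every first means the first occurrence of the chosen weekday in the month. It only produces on-the-hour schedules, in line with the restriction on the At field.

```python
WEEKDAYS = ("SUN", "MON", "TUE", "WED", "THU", "FRI", "SAT")


def quartz_cron(repeat: str, hour: int, weekday: str = "SUN") -> str:
    """Return a Quartz Cron expression for an on-the-hour schedule.

    Quartz field order: seconds minutes hours day-of-month month day-of-week.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    day = weekday.upper()[:3]  # "Sunday" -> "SUN"
    if day not in WEEKDAYS:
        raise ValueError(f"unknown weekday: {weekday}")

    if repeat == "Daily":
        # Every day; '?' means "no specific value" for day-of-week.
        return f"0 0 {hour} * * ?"
    if repeat == "Weekly":
        # Every week on the given weekday.
        return f"0 0 {hour} ? * {day}"
    if repeat == "Monthly":
        # '#1' selects the first occurrence of that weekday in the month.
        return f"0 0 {hour} ? * {day}#1"
    raise ValueError(f"unsupported repeat value: {repeat}")


print(quartz_cron("Daily", 14))              # 0 0 14 * * ?
print(quartz_cron("Weekly", 14, "Sunday"))   # 0 0 14 ? * SUN
print(quartz_cron("Monthly", 8, "Tuesday"))  # 0 0 8 ? * TUE#1
```

A custom expression is mainly useful for schedules that the Daily, Weekly, and Monthly options cannot express. Check any expression against the Quartz documentation before saving it.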

What's next

After the synchronization: