Edit a crawler

You can edit a crawler of an S3 File System asset in Data Catalog, for example to change its exclude patterns.

Prerequisites

  • You have a resource role with the Configure external system resource permission, for example Owner.
  • You have a global role with the Catalog global permission, for example Catalog Author.
  • You have registered an Amazon S3 file system.
  • You have configured one or more Jobservers in Collibra Console. If there is no available Jobserver, the Register data source actions will be grayed out in the global create menu of Collibra Data Intelligence Cloud.
  • You have connected an S3 File System asset to Amazon S3.
  • You have created one or more crawlers.

Steps

  1. Open an S3 File System asset page.
  2. In the tab pane, click Services Configuration.
  3. In the Crawlers section, in the row of the crawler that you want to edit, click the edit icon.
    The Edit crawler window appears.
  4. Enter the required information.
    Field             Description
    Domain            The domain in which the assets of the S3 file system are created.
    Name              The name of the crawler in Collibra.
    Include path      The case-sensitive path to a directory of a bucket in Amazon S3. All objects and subdirectories of this path are crawled. For more information and examples, see the AWS Glue documentation.
    Exclude patterns  Glob pattern that represents the objects that are in the include path, but that you want to exclude. For more information and examples, see the AWS Glue documentation.
    Add pattern       Button to add additional exclude patterns.
  5. Click Save.
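
Before you save new exclude patterns, it can help to reason about which objects they would skip. The following Python sketch is only an illustration of how the Include path and Exclude patterns fields from step 4 might interact. It is not Collibra or AWS Glue code. It assumes that patterns are evaluated relative to the include path, that `**` crosses directory boundaries, and that `*` and `?` do not. The function names `glob_to_regex` and `is_crawled`, and the sample object keys, are hypothetical. For the authoritative glob rules, see the AWS Glue documentation.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a simplified glob into a compiled, case-sensitive regex.

    Assumed semantics (a simplification for illustration only):
      **  matches any characters, including "/"
      *   matches any characters except "/"
      ?   matches exactly one character except "/"
    Every other character matches literally.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def is_crawled(key: str, include_path: str, exclude_patterns: list[str]) -> bool:
    """Return True if an object key is under the include path and not excluded."""
    prefix = include_path.rstrip("/") + "/"
    # The include path is case-sensitive, so a plain prefix comparison is used.
    if not key.startswith(prefix):
        return False
    # Assumption: exclude patterns are matched against the path relative
    # to the include path, and must match that relative path in full.
    relative = key[len(prefix):]
    return not any(glob_to_regex(p).fullmatch(relative) for p in exclude_patterns)


# Hypothetical include path and exclude patterns.
include_path = "sales"
exclude_patterns = ["**.tmp", "archive/**"]

for key in [
    "sales/2023/report.csv",   # under the include path, not excluded
    "sales/2023/scratch.tmp",  # excluded by **.tmp
    "sales/archive/old.csv",   # excluded by archive/**
    "Sales/2023/report.csv",   # different case, so outside the include path
]:
    print(key, is_crawled(key, include_path, exclude_patterns))
```

In this sketch, only the first key is reported as crawled. Because `*` does not cross a `/` here, a pattern such as `*.csv` would exclude only CSV files directly in the include path, while `**.csv` would exclude CSV files at any depth.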