Add the Amazon SageMaker Unified Studio capability

After you have created a connection to Amazon SageMaker Unified Studio in your Edge or Collibra Cloud site, you can add the SageMaker Unified Studio data catalog capability to the connection.

Prerequisites

In your Collibra environment:

Steps

  1. Open a site.
    1. On the main toolbar, click the Products icon, and then click Settings (cogwheel icon).
      The Settings page opens.
    2. In the tab pane, click Edge.
      The Sites tab opens and shows a table with an overview of your sites.
    3. In the table, click the name of the site whose status is Healthy.
      The site page opens.
  2. In the Capabilities section, click Add capability.
    The Create capability page appears.
  3. Select SageMaker Unified Studio data catalog.
  4. Enter the required information.
    Field | Description | Required
    Capability

    This section contains general information about the capability.

    Name

    The name of the capability.

    Yes

    Description

    The description of the capability.

    No

    AWS Connection

    The AWS connection to be used.

    Yes

    Create additional capabilities automatically

    Choose which capabilities to create automatically during synchronization if they do not already exist.

    Select one of the following options:

    • Do not create additional capabilities: None of the following capabilities are created during synchronization: Catalog JDBC Ingestion, JDBC Profiling, and Catalog JDBC Classification.
    • Create JDBC ingestion, profiling, and classification capabilities automatically: The following capabilities are created automatically: Catalog JDBC Ingestion, JDBC Profiling, and Catalog JDBC Classification. This is the default option.
    • Create JDBC ingestion, profiling, classification, and sampling capabilities automatically: The following capabilities are created automatically: Catalog JDBC Ingestion, JDBC Profiling, Catalog JDBC Classification, and Catalog JDBC Sampling.

    No

    Save Input Metadata

    Select the checkbox to save the input metadata extracted from the data source in ZIP files. The files can be useful for troubleshooting. Select this option only at the request of Collibra Support. If this option is selected, you can download the files from the Synchronization Result dialog box after the synchronization activity is completed.

    No

    Advanced Configuration

    These configuration options help when investigating issues with the capability.

    Important
    • Complete the fields Save Input Metadata, Logging configuration, Memory (MiB), and JVM arguments only at the request of, or together with, Collibra Support.
    • Only use Log level if your data source is a commercial JDBC offering. For more information, go to the Collibra Marketplace.

    No

    Debug

    This field is ignored when you integrate metadata from SageMaker Unified Studio.

    An option to automatically send Edge infrastructure log files to Collibra Platform. By default, this option is set to false.

    Note: We strongly recommend sending Edge infrastructure log files to Collibra Platform only when you have issues with Edge. If you set this option to true, it automatically reverts to false after 24 hours.

    No

    Log level

    This field is ignored when you integrate metadata from SageMaker Unified Studio.

    An option to determine the verbosity level of Catalog connector log files. By default, this option is set to No logging.

    No

  5. Click Add.
    The capability is added to the Edge or Collibra Cloud site.
    The fields become read-only.
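
The field rules in step 4 can be sketched as a small model, which may help when reviewing a planned configuration before entering it in the UI. This is an illustrative sketch only and not a Collibra API: the class `CapabilityForm`, the enum `AdditionalCapabilities`, and their member names are hypothetical. Only the field names, the required flags, the default option, and the capability names are taken from the table above.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class AdditionalCapabilities(Enum):
    """Hypothetical labels for the three options of
    'Create additional capabilities automatically'."""
    NONE = "do_not_create"
    DEFAULT = "ingestion_profiling_classification"
    WITH_SAMPLING = "ingestion_profiling_classification_sampling"


# Capability names as listed in the field description.
_BASE = ["Catalog JDBC Ingestion", "JDBC Profiling", "Catalog JDBC Classification"]

AUTO_CREATED = {
    AdditionalCapabilities.NONE: [],
    AdditionalCapabilities.DEFAULT: _BASE,
    AdditionalCapabilities.WITH_SAMPLING: _BASE + ["Catalog JDBC Sampling"],
}


@dataclass
class CapabilityForm:
    """Hypothetical stand-in for the Create capability form."""
    name: str = ""                      # required
    aws_connection: str = ""            # required
    description: str = ""               # optional
    additional: AdditionalCapabilities = AdditionalCapabilities.DEFAULT
    save_input_metadata: bool = False   # only at the request of Collibra Support

    def missing_required(self) -> list[str]:
        """Return the required fields that are still empty."""
        missing = []
        if not self.name.strip():
            missing.append("Name")
        if not self.aws_connection.strip():
            missing.append("AWS Connection")
        return missing

    def auto_created_capabilities(self) -> list[str]:
        """Return the capabilities created during synchronization
        if they do not already exist, for the selected option."""
        return list(AUTO_CREATED[self.additional])


# Example values are placeholders, not real connection names.
form = CapabilityForm(name="SageMaker catalog", aws_connection="my-aws-connection")
print(form.missing_required())
print(form.auto_created_capabilities())
```

With the default option, the sketch lists the ingestion, profiling, and classification capabilities. Sampling appears only when the third option is selected.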

What's next?

You can synchronize the SageMaker Unified Studio data catalog capability.