Synchronize Amazon SageMaker
Synchronizing AWS SageMaker AI is the process of integrating metadata from Amazon SageMaker and making the data available in Collibra Platform.
You can synchronize manually, or you can automate it by adding a synchronization schedule.
Before you begin
- You have created an AWS connection.
- You have added the AWS SageMaker AI capability to the AWS connection.
Required permissions
- You have a resource role with the Configure external system resource permission, for example, Owner.
- You have a global role with the Catalog global permission, for example, Catalog Author.
- You have a global role with the View Edge connections and capabilities global permission, for example, Edge integration engineer.
Steps
To synchronize manually:
- On the main toolbar, click → Catalog.
The Catalog homepage opens.
- Click Integrations.
The Integrations page opens.
- Click the Integration configuration tab.
- In the Connection name column, locate the AWS connection that you used when you added the AWS SageMaker AI capability and click the link in the Capabilities column.
The Synchronization page opens.
- In the Synchronization Configuration section, click Add Configuration.
- In Domain, select the Domain asset in which you want to add the AWS SageMaker AI assets.
Important: Ensure that you select a domain of the type Technology Asset Domain.
- Optionally, in AWS Regions, select the region of the AWS SageMaker AI assets. If no regions are selected, the integration searches all regions where Amazon SageMaker is available.
- Optionally, select Yes under Ingest deployment output if you want to ingest the S3 endpoint output paths as File and Storage container assets, which are linked to each AI Model Deployment asset.
- Optionally, in Custom AI Metrics Mappings, define which custom AWS SageMaker AI model metrics you want to integrate. To do this, add the mapping between the custom metric and the Collibra attribute.
Important: If you use this feature, add any custom attributes or characteristics, as needed, to the asset type assignment.
  - Click Add Custom AI Metrics Mappings.
  - In Metric, type the name of the custom metric manually.
    Use the exact name as it appears in the AWS SageMaker AI asset.
  - In Attribute, select the attribute in which you want to see the value.
    Make sure to select an attribute that is included in the AWS SageMaker AI Model asset type assignment.
- Click Save.
- Click Synchronize.
A notification indicates the synchronization has started.
To synchronize on a schedule:
- On the main toolbar, click → Catalog.
The Catalog homepage opens.
- Click Integrations.
The Integrations page opens.
- Click the Integration configuration tab.
- In the Connection name column, locate the AWS connection that you used when you added the AWS SageMaker AI capability and click the link in the Capabilities column.
The Synchronization page opens.
- In the Synchronization Configuration section, click Add Configuration.
- In Domain, select the Domain asset in which you want to add the AWS SageMaker AI assets.
Important: Ensure that you select a domain of the type Technology Asset Domain.
- Optionally, in AWS Regions, select the region of the AWS SageMaker AI assets. If no regions are selected, the integration searches all regions where Amazon SageMaker is available.
- Optionally, select Yes under Ingest deployment output if you want to ingest the S3 endpoint output paths as File and Storage container assets, which are linked to each AI Model Deployment asset.
- Optionally, in Custom AI Metrics Mappings, define which custom AWS SageMaker AI model metrics you want to integrate. To do this, add the mapping between the custom metric and the Collibra attribute.
Important: If you use this feature, add any custom attributes or characteristics, as needed, to the asset type assignment.
  - Click Add Custom AI Metrics Mappings.
  - In Metric, type the name of the custom metric manually.
    Use the exact name as it appears in the AWS SageMaker AI asset.
  - In Attribute, select the attribute in which you want to see the value.
    Make sure to select an attribute that is included in the AWS SageMaker AI Model asset type assignment.
- Click Save.
- In the Synchronization Schedule section, click Add schedule.
- Enter the required information and click Save:
| Field | Description |
| --- | --- |
| Repeat | The interval when you want to synchronize automatically. The possible values are: Daily, Weekly, Monthly, and Cron expression. |
| Cron | The Quartz Cron expression that determines when the synchronization takes place. This field is only visible if you select Cron expression in the Repeat field. |
| Every | The day on which you want to synchronize, for example, Sunday. This field is only visible if you select Weekly in the Repeat field. |
| Every first | The day of the month on which you want to synchronize, for example, Tuesday. This field is only visible if you select Monthly in the Repeat field. |
| At | The time at which you want to synchronize automatically, for example, 14:00. You can only schedule on the hour. For example, you can add a synchronization schedule at 8:00, but not at 8:45. This field is only visible if you select Daily, Weekly, or Monthly in the Repeat field. |
| Time zone | The time zone for the schedule. |
The synchronization job synchronizes the AWS SageMaker AI data.
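If you select Cron expression in the Repeat field, you need to write the Quartz Cron expression yourself. The following Python sketch is for illustration only and is not part of Collibra: it builds Quartz expressions (fields: seconds, minutes, hours, day of month, month, day of week) that would correspond to on-the-hour Daily, Weekly, and Monthly schedules. The helper name `quartz_cron` and the reading of Every first as "the first such weekday of the month" are assumptions.

```python
# Quartz numbers the days of the week from 1 (Sunday) to 7 (Saturday).
QUARTZ_DAY_OF_WEEK = {
    "Sunday": 1, "Monday": 2, "Tuesday": 3, "Wednesday": 4,
    "Thursday": 5, "Friday": 6, "Saturday": 7,
}


def quartz_cron(repeat, hour, day=None):
    """Build an on-the-hour Quartz Cron expression (hypothetical helper).

    repeat: "Daily", "Weekly", or "Monthly"
    hour:   hour of the day, 0 to 23
    day:    weekday name, required for "Weekly" and "Monthly"
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if repeat == "Daily":
        # Every day; '?' leaves the day-of-week field unspecified.
        return f"0 0 {hour} * * ?"
    if repeat not in ("Weekly", "Monthly"):
        raise ValueError(f"unsupported repeat value: {repeat!r}")
    if day not in QUARTZ_DAY_OF_WEEK:
        raise ValueError(f"unknown day: {day!r}")
    dow = QUARTZ_DAY_OF_WEEK[day]
    if repeat == "Weekly":
        # Every week on the given weekday.
        return f"0 0 {hour} ? * {dow}"
    # Monthly: '#1' selects the first occurrence of the weekday in the month.
    return f"0 0 {hour} ? * {dow}#1"


print(quartz_cron("Daily", 14))
print(quartz_cron("Weekly", 14, "Sunday"))
print(quartz_cron("Monthly", 8, "Tuesday"))
```

The seconds and minutes fields are fixed at 0 here to mirror the on-the-hour restriction of the At field. Whether that restriction also applies to the Cron field is not stated above, so check the resulting schedule after you save it.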
After the synchronization:
- You can view a summary of the results from the Activities list.
- The resulting assets get a relation to the Domain that you selected.
For information on the integrated data, go to Synchronized AWS SageMaker AI data.