Synchronize Snowflake Cortex AI

Synchronizing Snowflake Cortex AI is the process of integrating metadata from Snowflake Cortex AI and making that metadata available in Collibra Platform.

You can synchronize manually, or you can automate it by adding a synchronization schedule.

Prerequisites

In your Collibra environment

In your Snowflake Cortex AI environment

Your Snowflake Cortex AI integration role must have the following permissions:

Note 
  • Permissions are granted to roles, not individual users.
  • The integration role requires both the USAGE ON DATABASE and USAGE ON SCHEMA permissions. Only models and agents that are visible to the Snowflake Cortex AI integration role are ingested. Models and agents that aren't visible to this role are skipped, even if they match a path pattern that you provide in the synchronization configuration. To check which models and agents the role can see, run SHOW MODELS IN ACCOUNT or SHOW AGENTS IN ACCOUNT in Snowflake. If you have a secondary role, first run USE SECONDARY ROLES NONE to exclude any models or agents that only your secondary role can access.

  • USAGE permission on:
    • The database containing the models or agents.
    • The schema containing the models or agents.
    • The models or agents you want to ingest. You can scope this to a specific model or agent, or to all models or agents in a database.
  • If you plan to ingest Snowflake-provided foundational (CORTEX_BASE) models, grant the Snowflake-managed application role for the foundational models you want to ingest to your Snowflake Cortex AI integration role. For example:
    • Access to all foundational models: GRANT APPLICATION ROLE SNOWFLAKE."CORTEX-MODEL-ROLE-ALL" TO ROLE <your_integration_role>;
    • Access to a specific foundational model: GRANT APPLICATION ROLE SNOWFLAKE."CORTEX-MODEL-ROLE-CLAUDE-HAIKU-4-5" TO ROLE <your_integration_role>;

For more information on granting permissions to Snowflake roles, go to the Snowflake Cortex AI documentation.
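
The permissions above can be collected into a small script. The following is a minimal sketch in Python that only builds the SQL text; it does not connect to Snowflake. The role, database, and schema names (COLLIBRA_INTEGRATION, ML_DB, PROD) and both function names are hypothetical placeholders. Grants on individual models or agents are left out on purpose, so take their exact syntax from the Snowflake documentation.

```python
from typing import List


def build_grants(role: str, database: str, schema: str,
                 app_role: str = "CORTEX-MODEL-ROLE-ALL") -> List[str]:
    """Build the database, schema, and foundational-model grants for one role."""
    return [
        f"GRANT USAGE ON DATABASE {database} TO ROLE {role};",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO ROLE {role};",
        # Only needed if you ingest Snowflake-provided foundational models.
        f'GRANT APPLICATION ROLE SNOWFLAKE."{app_role}" TO ROLE {role};',
    ]


def build_visibility_check(role: str, object_kind: str = "MODELS") -> List[str]:
    """Build the statements that list what the role can see on its own."""
    if object_kind not in ("MODELS", "AGENTS"):
        raise ValueError("object_kind must be 'MODELS' or 'AGENTS'")
    return [
        f"USE ROLE {role};",
        # Drop secondary roles so that only the integration role counts.
        "USE SECONDARY ROLES NONE;",
        f"SHOW {object_kind} IN ACCOUNT;",
    ]


if __name__ == "__main__":
    # Hypothetical names; replace them with your own before running the SQL.
    statements = build_grants("COLLIBRA_INTEGRATION", "ML_DB", "PROD")
    statements += build_visibility_check("COLLIBRA_INTEGRATION", "MODELS")
    for statement in statements:
        print(statement)
```

Run the printed statements in a Snowflake worksheet with a role that is allowed to grant these privileges, and then compare the SHOW output with the models and agents you expect to ingest.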

Steps

  1. On the main toolbar, click the Products icon, and then click Catalog.
    The Catalog homepage opens.
  2. Click Integrations.
    The Integrations page opens.
  3. Click the Integration configuration tab.
  4. In the Connection name column, locate the Snowflake connection that you used when you added the Snowflake (In preview) capability and click the link in the Capabilities column.
    The Synchronization page opens.
  5. In the Synchronization Configuration section, click Add Configuration.
  6. Go to the General settings tab and add the System asset that represents your Snowflake connection. This asset is used by the Snowflake Cortex AI workflow for run tracking.
    Note If this asset doesn't exist yet, you must create it before synchronizing your Snowflake Cortex AI integration.
  7. If you want to refresh the dynamic fields pulled from Snowflake Cortex AI, click Updated: <timestamp> next to Synchronization Configuration. The timestamp indicates the last time the data was loaded from Snowflake Cortex AI.
    Note If the Updated: <timestamp> is green, the data is up to date and you don't need to refresh. If it is red, refresh the data. If you don't see the option, the metadata hasn't been refreshed yet. Because metadata is refreshed daily, check again later or the next day.
  8. Go to the Cortex AI tab.
  9. In Domain mappings, map Snowflake models to Collibra domains using path patterns in the following format: DATABASE > SCHEMA > MODEL.
  10. In Exclude patterns, add path patterns for the models that you want the synchronization to skip, in the following format: DATABASE > SCHEMA > MODEL. Exclude patterns are always evaluated first, regardless of what you add in Domain mappings.
  11. In Fallback domain, select the default domain for models that don't match any path pattern added in Domain mappings. If you don't select a Fallback domain, unmatched models are skipped.
    Note Fallback domain is required if you haven't added a Domain mapping. Both can be used simultaneously to ensure no models are skipped.
  12. Optionally, in Custom AI Metrics Mappings, define which custom Snowflake Cortex AI model metrics you want to integrate. You do this by adding the mapping between the custom metric and the Collibra attribute. The attribute list contains all attribute types that are assigned to the Snowflake Cortex AI Model asset type.

    After you synchronize the capability, the specified custom Snowflake Cortex AI Model metrics are mapped to the corresponding attributes.

  13. Optionally, expand Advanced settings and set the Asset deletion mode:
    1. Soft: Assets are archived and can be restored if the model reappears. This is the recommended mode.
    2. Hard: Assets are permanently deleted and can't be recovered.
  14. Click Save.
  15. Click Synchronize.
    A notification indicates the synchronization has started.
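
The way Exclude patterns, Domain mappings, and Fallback domain interact in the steps above can be sketched in a few lines of Python. This is an illustration of the documented precedence only, not the integration's actual code. The function name resolve_domain, the domain names, and the paths are hypothetical. The sketch also assumes shell-style wildcards (*) in path patterns and that the first matching mapping wins; neither is confirmed by this page.

```python
from fnmatch import fnmatchcase
from typing import List, Optional, Tuple


def resolve_domain(path: str,
                   domain_mappings: List[Tuple[str, str]],
                   exclude_patterns: List[str],
                   fallback_domain: Optional[str] = None) -> Optional[str]:
    """Return the target domain for a model path, or None if it is skipped."""
    # Exclude patterns are considered first, whatever the mappings say.
    if any(fnmatchcase(path, pattern) for pattern in exclude_patterns):
        return None
    # Assumption: mappings are checked in order and the first match wins.
    for pattern, domain in domain_mappings:
        if fnmatchcase(path, pattern):
            return domain
    # Unmatched models go to the fallback domain, or are skipped without one.
    return fallback_domain


if __name__ == "__main__":
    mappings = [("ML_DB > PROD > *", "Production Models")]
    excludes = ["ML_DB > PROD > TMP_*"]
    for model_path in ("ML_DB > PROD > CHURN_MODEL",
                       "ML_DB > PROD > TMP_SCRATCH",
                       "OTHER_DB > DEV > DEMO_MODEL"):
        print(model_path, "->",
              resolve_domain(model_path, mappings, excludes, "Unsorted AI Models"))
```

In this sketch, a model that matches an exclude pattern is skipped even when a domain mapping also matches it, and a model that matches nothing is skipped only when no fallback domain is set.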

To add a synchronization schedule instead of synchronizing manually, complete steps 1 through 14 above, and then continue with the following steps instead of clicking Synchronize:

  15. In the Synchronization Schedule section, click Add schedule.
  16. Enter the required information and click Save:
    Field: Description
    • Repeat: The interval when you want to synchronize automatically. The possible values are: Daily, Weekly, Monthly, and Cron expression.
    • Cron: The Quartz Cron expression that determines when the synchronization takes place. This field is only visible if you select Cron expression in the Repeat field.
    • Every: The day on which you want to synchronize, for example, Sunday. This field is only visible if you select Weekly in the Repeat field.
    • Every first: The day of the month on which you want to synchronize, for example, Tuesday. This field is only visible if you select Monthly in the Repeat field.
    • At: The time at which you want to synchronize automatically, for example, 14:00.
      • You can only schedule on the hour. For example, you can add a synchronization schedule at 8:00, but not at 8:45.
      • This field is only visible if you select Daily, Weekly, or Monthly in the Repeat field.
    • Time zone: The time zone for the schedule.
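
If you select Cron expression in the Repeat field, it can help to see how the other Repeat options might look as Quartz Cron expressions. The following Python sketch builds such expressions for on-the-hour schedules. The function name quartz_cron is hypothetical, and reading Every first as "the first such weekday of the month" (Quartz #1) is an assumption. Whether Collibra generates these exact expressions internally is not stated on this page.

```python
from typing import Optional

# Day-of-week names that Quartz accepts.
DAYS = ("SUN", "MON", "TUE", "WED", "THU", "FRI", "SAT")


def quartz_cron(repeat: str, hour: int, day: Optional[str] = None) -> str:
    """Build a Quartz Cron expression (sec min hour day-of-month month day-of-week)."""
    if hour not in range(24):
        raise ValueError("hour must be between 0 and 23")
    if repeat == "Daily":
        # Every day at the given hour, on the hour.
        return f"0 0 {hour} * * ?"
    if day not in DAYS:
        raise ValueError(f"day must be one of {DAYS}")
    if repeat == "Weekly":
        # Every week on the given weekday.
        return f"0 0 {hour} ? * {day}"
    if repeat == "Monthly":
        # Assumption: "Every first" means the first such weekday of the month.
        return f"0 0 {hour} ? * {day}#1"
    raise ValueError("repeat must be 'Daily', 'Weekly', or 'Monthly'")


if __name__ == "__main__":
    print(quartz_cron("Daily", 14))
    print(quartz_cron("Weekly", 14, "SUN"))
    print(quartz_cron("Monthly", 8, "TUE"))
```

Quartz expressions start with a seconds field, so the leading "0 0" keeps every run on the hour, in line with the At field restriction described above.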

What's next

The synchronization job synchronizes the Snowflake Cortex AI data.
After the synchronization:

Helpful resources

Check out the course "A human-centered intro to AI integrations" in Collibra University.