About Edge capabilities

An Edge capability, such as Sampling or S3 synchronization, is an application that can run on an Edge site. It can access a data source to extract and process data as needed. This data can be stored in an encrypted cache to improve the security of your data and platform. An Edge capability for a specific data source runs as a job and delivers its output to Collibra Data Intelligence Platform securely and reliably.

An Edge capability has a capability template that defines a specific use case, for example, data source ingestion.

Capability templates

A capability template is developed for a specific task on a specific data source type. The capability template also determines which properties are available to configure the Edge capability.

Capability template and description

ADLS synchronization
  • Used to connect to Azure Data Lake Storage (ADLS).

Catalog Data Classification
  • Used to classify data from a registered JDBC data source in the Edge site.
  • This capability can't be added to an Edge site that uses a man-in-the-middle (MITM) proxy.

Catalog JDBC ingestion
  • Used to register a data source and synchronize schemas from a data source via a JDBC connection.
  • This capability can't be added to an Edge site that uses a MITM proxy.

Catalog JDBC Sampling
  • Used to collect and cache sample data from a data source in the Edge site via a JDBC connection.
  • Ensure that you meet the Catalog JDBC Sampling hardware requirements in addition to the Edge site requirements.
  • This capability can't be added to an Edge site that uses a MITM proxy.

Collibra Protect for AWS Lake Formation
  • Used to set up Protect for AWS Lake Formation.
  • This capability can't be added to an Edge site that uses a MITM proxy.

Collibra Protect for Google BigQuery
  • Used to set up Protect for BigQuery.
  • This capability can't be added to an Edge site that uses a MITM proxy.

Collibra Protect for Snowflake
  • Used to set up Protect for Snowflake.
  • This capability can't be added to an Edge site that uses a MITM proxy.

DQ Connector
  • Used to ingest Collibra Data Quality & Observability user-defined rules, metrics, and dimensions into Collibra Data Catalog.
  • This capability can't be added to an Edge site that uses a MITM proxy.

GCS synchronization
  • Used to connect to Google Cloud Storage.

JDBC Profiling
  • Used to profile and classify data from a registered data source.
  • This capability can't be added to an Edge site that uses a MITM proxy.

S3 synchronization
  • Used to connect to Amazon S3.

Databricks Unity Catalog synchronization
  • Used to connect to Databricks Unity Catalog.

Technical lineage capabilities
  • Used to create technical lineage for different data sources. For details, go to Add a technical lineage capability to an Edge site.
  • Ensure that you meet the Technical Lineage minimum network requirements in addition to the Edge site requirements.
  • You can use a MITM proxy between Edge and the Collibra Data Lineage service instances. For details on which data sources support the use of proxies, go to Create a technical lineage via Edge, select your data source, and see our test results in the Connect to a proxy server section.

Important: While these capability templates are available for all customers, the features that you use them for might still be in beta.
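The MITM proxy restrictions listed for the capability templates can be captured in a small lookup, which may help when planning which capabilities to add to an Edge site. The following is a minimal sketch in Python, not part of any Collibra API: the names MITM_BLOCKED, MITM_ALLOWED, mitm_proxy_allowed, and blocked_templates are hypothetical. Templates for which no proxy restriction is stated return None rather than a guess.

```python
from typing import List, Optional

# Templates that the list above says can't be added to an Edge site
# that uses a MITM proxy.
MITM_BLOCKED = frozenset({
    "Catalog Data Classification",
    "Catalog JDBC ingestion",
    "Catalog JDBC Sampling",
    "Collibra Protect for AWS Lake Formation",
    "Collibra Protect for Google BigQuery",
    "Collibra Protect for Snowflake",
    "DQ Connector",
    "JDBC Profiling",
})

# Templates for which a MITM proxy is stated to be usable. For technical
# lineage, actual support still depends on the data source.
MITM_ALLOWED = frozenset({"Technical lineage capabilities"})


def mitm_proxy_allowed(template: str) -> Optional[bool]:
    """Return False if blocked, True if allowed, None if nothing is stated."""
    if template in MITM_BLOCKED:
        return False
    if template in MITM_ALLOWED:
        return True
    return None


def blocked_templates(planned: List[str]) -> List[str]:
    """Return the planned templates that can't go on a MITM-proxy Edge site."""
    return [name for name in planned if mitm_proxy_allowed(name) is False]
```

For example, blocked_templates(["S3 synchronization", "JDBC Profiling"]) returns only "JDBC Profiling", because no proxy restriction is stated for S3 synchronization.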

Capability template structure

Each Edge capability template contains the following:

File and description

A manifest file (YAML)
  • This file contains the capability metadata and input parameter requirements.

A workflow file (YAML)
  • This file defines the workflow and binds the parameters to capability containers.

Docker images
  • One or more Docker images that implement the business logic.

Note: Each type of capability has its own required custom properties. These properties appear after you select a capability template from the dropdown menu.
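To illustrate how the three parts of a capability template relate to each other, the following Python sketch models them as plain data classes. It is an illustration only: the actual manifest and workflow schemas are not described here, so every class, field, and example value below (Manifest, Workflow, CapabilityTemplate, required_parameters, bindings, "example-sync", "example/sync-image") is a hypothetical assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Manifest:
    """Hypothetical stand-in for the manifest file: metadata and input parameters."""
    name: str
    required_parameters: List[str] = field(default_factory=list)


@dataclass
class Workflow:
    """Hypothetical stand-in for the workflow file."""
    # Maps a container (Docker image) name to the parameters bound to it.
    bindings: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class CapabilityTemplate:
    manifest: Manifest
    workflow: Workflow
    docker_images: List[str] = field(default_factory=list)


def missing_parameters(template: CapabilityTemplate, properties: Dict[str, str]) -> List[str]:
    """Return the required parameters that the supplied properties don't cover."""
    return [p for p in template.manifest.required_parameters if p not in properties]


def container_inputs(template: CapabilityTemplate, properties: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Resolve, per container, the property values that the workflow binds to it."""
    return {
        image: {p: properties[p] for p in params if p in properties}
        for image, params in template.workflow.bindings.items()
    }


# Invented example values, for illustration only.
EXAMPLE = CapabilityTemplate(
    manifest=Manifest("example-sync", ["bucket", "region"]),
    workflow=Workflow({"example/sync-image": ["bucket", "region"]}),
    docker_images=["example/sync-image"],
)
```

In this sketch, missing_parameters(EXAMPLE, {"bucket": "b"}) reports that "region" has not been supplied yet. This loosely mirrors how the required custom properties depend on the capability template you select.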