Profiling and classification via Edge

When you register a data source, the Jobserver ingests metadata into Data Catalog. An Edge site then profiles and classifies the data and sends the results to Collibra Data Intelligence Cloud.

Prerequisites

  • You have created a support ticket in Zendesk to request access to Edge.
  • You have created and installed an Edge site.
  • You have an Edge site role with the following global permissions:
    • Data Catalog
    • Register Profiling Information
  • You have a role with the following resource permissions on the community:
    • Asset: add
    • Attribute: add
    • Domain: add
    • Attachment: add
    Note: These permissions are always necessary when registering a data source.
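
The permission prerequisites above can be treated as a checklist. The following is a minimal sketch, assuming you have already collected the permission labels of your role by some other means. The function name `missing_permissions` and the label strings are hypothetical: the strings are copied from the list above for illustration and are not identifiers from any Collibra API.

```python
# Hypothetical permission labels copied from the prerequisites list;
# they are not identifiers from any Collibra API.
REQUIRED_GLOBAL = {"Data Catalog", "Register Profiling Information"}
REQUIRED_RESOURCE = {"Asset: add", "Attribute: add", "Domain: add", "Attachment: add"}


def missing_permissions(global_perms, resource_perms):
    """Return the required permissions that are absent, sorted for stable output."""
    missing_global = REQUIRED_GLOBAL - set(global_perms)
    missing_resource = REQUIRED_RESOURCE - set(resource_perms)
    return sorted(missing_global), sorted(missing_resource)


if __name__ == "__main__":
    # Example role that lacks one global and one resource permission.
    g, r = missing_permissions(
        ["Data Catalog"],
        ["Asset: add", "Attribute: add", "Domain: add"],
    )
    print("Missing global permissions:", g)
    print("Missing resource permissions:", r)
```

Both returned lists are empty when every prerequisite permission is present.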

Steps

  1. Enable Profiling and Classification on Edge in Collibra Console.
  2. Register a data source and do the following:
    1. Optionally, enable push down sampling.
    2. In the Profiling options, click Profile and classify data.
    Collibra Data Intelligence Cloud first ingests metadata via Jobserver and then profiles and classifies the data via the profiling capability of an Edge site.

Tip: Collibra Data Intelligence Cloud has access only to ingested metadata, anonymized profiling results, and classification suggestions, not to the actual data in your data source.
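
To illustrate the idea that only summarized results leave the Edge site, here is a minimal sketch of local profiling that returns aggregates and never the row values themselves. The function `profile_column` and the statistics it computes (row, null, and distinct counts) are assumptions chosen for illustration. They are not the profiling results that Edge actually produces.

```python
def profile_column(values):
    """Summarize a column locally; only aggregates are returned, never row values."""
    non_null = [v for v in values if v is not None]
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
    }


if __name__ == "__main__":
    # Illustrative sample column; the raw values stay inside this process.
    emails = ["a@example.com", "b@example.com", None, "a@example.com"]
    print(profile_column(emails))
```

The returned dictionary contains only counts, so nothing in it can be used to read back an individual value from the source column.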

Configurations in Collibra Console

If you use profiling and classification via Edge, some settings in Collibra Console are no longer relevant or affect whether Edge is used:

| Section | Setting | Description |
| --- | --- | --- |
| Data profiling | Anonymize data | This setting is no longer relevant. Profiling data is automatically anonymized on an Edge site before it reaches Collibra Data Intelligence Cloud. |
| Cloud Classification configuration | Enable data classification | If Enable data classification is set to True, profiling and classification on Edge is disabled. You can then classify data through the Data Classification platform by clicking the Classify button on Column and Table asset pages, instead of automatically via Edge. |
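
The effect of the Enable data classification setting can be summarized as a simple decision. This is a minimal sketch, assuming profiling and classification on Edge has otherwise been enabled and selected during registration. The function name `classification_mode` and the returned labels are hypothetical and exist only to illustrate the rule stated above.

```python
def classification_mode(enable_data_classification: bool) -> str:
    """Map the Console setting to the classification path described above."""
    if enable_data_classification:
        # Edge profiling and classification is disabled; classification is
        # started manually with the Classify button on asset pages.
        return "manual via Data Classification platform"
    # Assumes Edge profiling and classification is enabled and selected.
    return "automatic via Edge"


if __name__ == "__main__":
    for setting in (True, False):
        print(f"Enable data classification = {setting}: {classification_mode(setting)}")
```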