Configure the use of sample data via Edge

You must configure your Collibra environment if you want to display sample data for data sources registered via Edge.

Important This is a Beta feature. See Beta feature: limitations and guidelines.

Warning Sample data for data sources registered via Edge is temporarily cached on the Edge site. The cached sample data is not encrypted, which means that it is available in clear text in the Edge cache for 24 to 48 hours. Only the key that identifies the origin of the sample data is encrypted.

  Configuration step More details
1 Ensure the users have the required permissions. See Required permissions to view sample data.

Important Several out-of-the-box roles already include the required permissions. Review the permissions assigned to those roles before enabling the public beta feature.

2
  • Activate the public beta feature via the Data Profiling setting Sample data on Edge.
  • Set the Data Profiling setting Maximum number of samples to a value higher than 0.

Warning For performance reasons, do not set this value higher than 1,000. The limit of 1,000 will be enforced in a later release.

Important The Maximum number of samples value applies to both Jobserver and Edge. In mixed environments, increasing the value can result in sample data extraction for data sources registered via Jobserver.

    Depending on your environment, you have to follow this procedure either in the Services Configuration section of the Collibra settings or in Collibra Console:

    Prerequisites

    Steps

    1. Open the Services Configuration page.
      1. On the main menu, click , then Settings.
        The Collibra settings page opens.
      2. In the tab pane, click Services Configuration.
      Alternatively, in Collibra Console, open the DGC service settings for editing:
      1. Open Collibra Console.
        Collibra Console opens with the Infrastructure page.
      2. In the tab pane, expand an environment to show its services.
      3. In the tab pane, click the Data Governance Center service of that environment.
      4. Click Configuration.
      5. Click Edit configuration.
    2. Go to the Data profiling section.
    3. In Sample data on Edge, select True.
    4. Make sure that Maximum number of samples is set to a value higher than 0.
      The default value is 100. See also DGC service configuration: options.
    5. Click the green Save all button.
3 For each data source, add the following Edge capability: Catalog JDBC Sampling.

For information on how to add capabilities, see Add an Edge capability to an Edge site.

  • The Catalog JDBC Sampling capability collects and caches sample data for a given JDBC data source on the Edge site, and retrieves sample data from the Edge cache.
  • Once the capability is selected, define the JDBC connection to which the capability applies.

For detailed information on the sample data process, see Understanding the process to display sample data.
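
The value constraints from step 2 can be summarized as a small check. The following Python sketch is illustrative only: the class and function names are hypothetical, it does not call any Collibra API, and it merely encodes the documented rules (Sample data on Edge set to True, Maximum number of samples higher than 0 and not higher than 1,000, default 100).

```python
from dataclasses import dataclass
from typing import List

RECOMMENDED_MAX_SAMPLES = 1000  # upper bound recommended in step 2
DEFAULT_MAX_SAMPLES = 100       # documented default value


@dataclass
class DataProfilingSettings:
    """Hypothetical container mirroring the two Data profiling settings."""
    sample_data_on_edge: bool = False
    maximum_number_of_samples: int = DEFAULT_MAX_SAMPLES


def check_settings(settings: DataProfilingSettings) -> List[str]:
    """Return a list of problems; an empty list means the values satisfy step 2."""
    problems = []
    if not settings.sample_data_on_edge:
        problems.append("Sample data on Edge must be set to True.")
    if settings.maximum_number_of_samples <= 0:
        problems.append("Maximum number of samples must be higher than 0.")
    elif settings.maximum_number_of_samples > RECOMMENDED_MAX_SAMPLES:
        problems.append("Maximum number of samples should not exceed 1,000.")
    return problems


if __name__ == "__main__":
    # Example: feature enabled, but the sample limit is set too high.
    for problem in check_settings(DataProfilingSettings(True, 5000)):
        print(problem)
```

A check like this could be kept alongside a rollout checklist. Remember that the Maximum number of samples value also applies to data sources registered via Jobserver.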