Anonymization of profiling results via Edge

When you profile and classify via Edge, Edge anonymizes the profiling results of columns with the data type Text or Geo at the end of the profiling process, before the results are sent to Collibra. The data is anonymized automatically for security reasons, because the profiling results are stored in Collibra.

As a result:

  • Values shown in the data distribution charts are replaced by a random hash value for columns that contain these data types. The attributes of the type Mode and Percentiles are also anonymized for columns with the data type Text or Geo.
  • Identical values in a column get the same hash value so that you can still recognize the values as identical.
Example 

You have profiled and classified a column with Text data via Edge.
If you go to the Summary, Overview or Data Profiling tab, all profiling results for textual and geographical data are removed or replaced by hashed values.
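
The hashing behavior described above can be sketched as follows. This is only an illustration, not Edge's actual implementation: the salted SHA-256 scheme, the function name anonymize_distribution, and the 12-character truncation are all assumptions made for the example. It shows how a value distribution can be anonymized while identical values still map to the same hash.

```python
import hashlib
import secrets
from collections import Counter


def anonymize_distribution(values, salt=None):
    """Count values by hash instead of by their original content.

    Hypothetical helper: Edge's real hashing scheme is not documented here.
    A random salt is generated per run, so hashes look random, but within
    one run identical values always produce the same hash.
    """
    if salt is None:
        salt = secrets.token_bytes(16)

    def hash_value(value):
        digest = hashlib.sha256(salt + str(value).encode("utf-8"))
        return digest.hexdigest()[:12]  # shortened for display only

    return Counter(hash_value(v) for v in values)


# Two identical values share one hash, so the chart still shows a count of 2.
distribution = anonymize_distribution(["Paris", "Paris", "Berlin"])
for hashed, count in distribution.items():
    print(hashed, count)
```

In this sketch the original strings never appear in the output, yet the shape of the distribution (one value occurring twice, one occurring once) is preserved.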

Important 

You may still see sample data in full. The sample data is not anonymized because:

  • Access to sample data via Edge is based on permissions.
  • If you have permission to view sample data, the sample data is collected and shown for a limited time. The sample data is not stored in Collibra.

For more details, go to the Sample data documentation.

Note 

Edge detects the data type of a column during profiling and only anonymizes the results if the data type attribute is Text or Geo. However, if the detected data type does not match the actual data type, some data may not be anonymized, or data may be anonymized unnecessarily. To solve this, manually modify the column's data type and profile the column again.
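
The dependency on the detected data type can be illustrated with a minimal sketch. The names below (ANONYMIZED_TYPES, prepare_results, mask) and the "Numeric" type label are hypothetical and do not reflect Edge's internal code; the sketch only shows why a wrongly detected type leads to results that are not anonymized.

```python
# Data types whose profiling results are anonymized, per the text above.
ANONYMIZED_TYPES = {"Text", "Geo"}


def mask(value):
    """Placeholder for the real hashing step (assumption for this sketch)."""
    return "<hashed>"


def prepare_results(detected_type, results):
    """Return profiling results, anonymized only for Text or Geo columns."""
    if detected_type in ANONYMIZED_TYPES:
        return {name: mask(value) for name, value in results.items()}
    return dict(results)


# A column of postal codes correctly detected as Text is anonymized.
print(prepare_results("Text", {"Mode": "1000"}))

# The same column wrongly detected as a numeric type is left in clear text.
print(prepare_results("Numeric", {"Mode": "1000"}))
```

Changing the column's data type to Text and profiling again corresponds to calling the function with the corrected type.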