Data profiling of a column

Important 

In Collibra 2024.02, we've launched a new user interface (UI) in beta for Collibra Data Intelligence Platform! You can learn more about this latest UI in the UI overview.

The documentation below covers both the latest UI and the previous, classic UI.

In the Data Profiling tab of a Column asset, you can see the details of the column.

The details are grouped into the following fixed sections:

Section           Content
Metadata          Contains the metadata of the column, such as the data type and column name.
Basic Statistics  Contains the basic statistics of the data, such as the minimum and maximum values.
Counts            Contains basic content information, such as the number of rows and the number of distinct values.
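
To make the Basic Statistics and Counts values concrete, here is a minimal Python sketch of how such figures can be derived from raw column values. It is an illustration only, not Collibra's profiling implementation: the `ColumnProfile` class and `profile_column` function are hypothetical names, and treating `None` as a missing value is an assumption.

```python
from dataclasses import dataclass
from typing import Any, Optional, Sequence


@dataclass
class ColumnProfile:
    """Hypothetical container for a few profiling results."""
    row_count: int
    distinct_count: int
    minimum: Optional[Any]
    maximum: Optional[Any]


def profile_column(values: Sequence[Any]) -> ColumnProfile:
    """Compute simple counts and basic statistics for one column.

    Assumption: None marks a missing value. Missing values count as rows
    but are ignored for distinct values, minimum, and maximum.
    """
    present = [v for v in values if v is not None]
    return ColumnProfile(
        row_count=len(values),
        distinct_count=len(set(present)),
        minimum=min(present) if present else None,
        maximum=max(present) if present else None,
    )


profile = profile_column([12, 7, 7, None, 30])
print(profile)
```

The same function works for textual values, where minimum and maximum follow Python's string ordering.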

Depending on the column's data type, you can find extra sections:

Section    Content
Quantiles  Contains the descriptive statistics of the data. This section is available only if the data type is numerical.
Charts     Displays the statistics graphically. The chart type depends on the data type:
             • bar chart: textual, boolean, and numerical data that is considered categorical (Categorical Data = true).
             • data distribution chart: numerical and date and time data.
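
The chart rule above can be read as a small decision function. The following Python sketch is one possible reading, not the product's actual logic. The function name `charts_for` and the data type labels (`"text"`, `"boolean"`, `"numeric"`, `"date_time"`) are hypothetical, and the sketch assumes that the Categorical Data flag qualifies numerical data only.

```python
from typing import List


def charts_for(data_type: str, categorical: bool = False) -> List[str]:
    """Return the chart types that apply to a column.

    Assumptions: the data type labels are illustrative, and the
    Categorical Data flag matters only for numerical columns.
    """
    charts: List[str] = []
    # Bar chart: textual, boolean, and categorical numerical data.
    if data_type in ("text", "boolean") or (data_type == "numeric" and categorical):
        charts.append("bar chart")
    # Data distribution chart: numerical and date and time data.
    if data_type in ("numeric", "date_time"):
        charts.append("data distribution chart")
    return charts


for data_type, flag in [("text", False), ("numeric", False), ("numeric", True)]:
    print(data_type, flag, charts_for(data_type, flag))
```

Under this reading, a numerical column flagged as categorical qualifies for both chart types.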

In the Summary tab of a Column asset, in the Descriptive Statistics section, you can see the statistics and metadata of the column. To learn more about descriptive statistics, go to the descriptive statistics wiki page.

The details are grouped in tabs:

Section       Content
Frequency     A bar chart that shows the values of the different categories. If there are too many categories, only the first 50 and last 50 values are displayed. This section is available only for textual, boolean, and numerical data that is considered categorical (Categorical Data = true).
Distribution  A distribution chart for numerical and date and time data. This section is available only if the data type is numerical or date and time.
Statistics    Contains the basic statistics of the data, such as the number of rows, number of distinct values, minimum value, and maximum value.
Metadata      Contains the metadata of the column, such as the column position and whether the column is the primary key.
Quantiles     Contains the quantiles of the data. This section is available only if the data type is numerical.
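
The "first 50 and last 50" behavior of the Frequency chart can be sketched in Python as follows. This is an illustration under stated assumptions, not the product's implementation: the function name `frequency_bars` is hypothetical, categories are assumed to be ordered by descending count, and "too many" is assumed to mean more than 100 categories. The source does not specify either detail.

```python
from collections import Counter
from typing import Any, Iterable, List, Tuple


def frequency_bars(values: Iterable[Any], keep: int = 50) -> List[Tuple[Any, int]]:
    """Return (category, count) pairs for a frequency bar chart.

    Assumptions: bars are ordered from most to least frequent, and
    truncation applies only when there are more than 2 * keep categories.
    """
    counts = Counter(values).most_common()  # most frequent first
    if len(counts) <= 2 * keep:
        return counts
    # Keep the first `keep` and the last `keep` categories; drop the middle.
    return counts[:keep] + counts[-keep:]


# 150 categories, where category i occurs i + 1 times.
sample = [i for i in range(150) for _ in range(i + 1)]
bars = frequency_bars(sample)
print(len(bars), bars[0], bars[-1])
```

With fewer than 101 categories, the function returns every category unchanged.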

The following information is available in the At a Glance side panel: Table Type, Data Source Type, Row Count, and Technical Data Type.

Note 
If you use Jobserver, you can anonymize columns with data type Text or Geo by enabling the Anonymize data feature in Collibra Console.
If you use Edge, the profiling results for columns with Text or Geo data type are automatically anonymized. You can anonymize all columns by enabling the Anonymize Edge profiling results for all data types feature.