Data profiling information

If you create a data profile of registered data, profiling results are generated in the table and column assets.

  • If you use Jobserver to register the data source, data profiling information depends on the profile options that you selected when you registered the data source.
  • If you use Edge to register the data source, most information becomes available only after you explicitly profile the data. For an overview of the data that becomes available after the registration of a data source via Edge, see Data source registration information.

| Column attribute | Profiling option | Statistics | Description | Retrieved from JDBC property |
| --- | --- | --- | --- | --- |
| Column Name | No option selected | N/A | The column name of the registered table. | COLUMN_NAME |
| Data Type | Store Data Profile. If you want to have Advanced Data Type detected, select Detect advanced data types. | N/A | The data type of the column. This type is detected by the profiling process and can differ from the Technical Data Type value. For example, if a database has a column with text as data type, and the column contains only integer values, the profiling process will set the Whole Number data type instead of text. If you enable the Anonymize data option in Collibra Console, Collibra anonymizes data in Column assets that have data type Text and Geo. If the profiling process has detected a wrong data type, you can update it afterwards. | |
| Description from Source | No option selected | N/A | The description of the column in the data source. | REMARKS |
| Row Count | Store Data Profile | Exact | The number of rows in the data source. | |
| Empty Values Count | Store Data Profile | Exact | The number of rows that are empty. | |
| Number of distinct values | Store Data Profile | Exact or approximate depending on column cardinality | The number of unique values in the column. | |
| Chart | Store Data Profile | Depending on chart type | This column displays whether a chart was generated for the column (an icon is shown) or not (no icon available). If you hover over the icon, you see a preview of the chart. The chart type varies per data type. The following charts are available: frequency chart; histogram that shows distribution; probability distribution curve. Note: Charts are not available for the following data types: Data type = Text and Categorical Data = false; Data type = Array; Data type = N/A. | |
| Frequency | Store Data Profile | Exact or approximate depending on column cardinality | A bar chart showing frequency data. | |
| Distribution - Histogram | Store Data Profile | Approximate | A histogram showing the representation of the distribution of numerical data. | |
| Distribution - Probability distribution curve | Store Data Profile | Approximate | A curve showing the representation of the probability distribution of numerical data. | |
| Technical Data Type | No option selected | N/A | Data type of the column as defined in the source. This value can differ from the Data Type value. | TYPE_NAME |
| Descriptive statistics (decile, percentile, quartiles) | Store Data Profile | Approximate | The value of the calculated statistic of the registered data. | |
| Categorical Data | Store Data Profile | Exact or approximate depending on column cardinality | Indication whether the data in the column is categorical or not. For example, if 100 000 rows are registered and there are only five distinct values, then the data is considered to be categorical. | |
| Categories | Store Data Profile | Exact or approximate depending on column cardinality | List of detected categories. This column has only values if the data is categorical. | |
| Char octet Length | No option selected | N/A | Maximum number of bytes in a character type's column. | CHAR_OCTET_LENGTH |
| Column Position | No option selected | N/A | The index of the column in the source table. | ORDINAL_POSITION |
| Is Auto Incremented | No option selected | N/A | Indication whether the data in the column is auto-incremented or not. | IS_AUTOINCREMENT |
| Is Generated | No option selected | N/A | Indication whether the data in the column is generated or not. | IS_GENERATEDCOLUMN |
| Is Nullable | No option selected | N/A | Indication whether the column can store NULL values or not. | IS_NULLABLE |
| Is Primary Key | No option selected | N/A | Indication whether the column is a primary key or not. | True if the primary keys resultSet contains the COLUMN_NAME |
| Maximum Text Length | Store Data Profile | Exact | The length of the longest text value in the column, including white spaces. | |
| Maximum Value | Store Data Profile | Exact | The maximum value in the column. | |
| Mean | Store Data Profile | Exact | The mean of all the values in the column, excluding empty rows. | |
| Median | Store Data Profile | Exact | The median value of the column. | |
| Minimum Text Length | Store Data Profile | Exact | The length of the shortest text value in the column. | |
| Minimum Value | Store Data Profile | Exact | The minimum value in the column. | |
| Mode | Store Data Profile | Exact or approximate depending on column cardinality | The value with the highest frequency for categorical data. | |
| Number Of Fractional Digits | No option selected | N/A | The number of fractional digits. | DECIMAL_DIGITS |
| Primary Key Name | No option selected | N/A | The name of the primary key composed by the column. | PK_NAME |
| Size | No option selected | N/A | The size of the column in the table. | COLUMN_SIZE |
| Standard Deviation | Store Data Profile | Exact | The statistical standard deviation of numeric values. | |
| Variance | Store Data Profile | Exact | The statistical variance of numeric values. | |
| Sample | Store Sample Data | N/A | A random sample of the data set that represents the entire data set. Note: In Edge, viewing sample data is not linked to the profiling feature. See Sample data. | |
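
To make the profiled column attributes above more concrete, the following is a minimal Python sketch of how a few of them could be computed for a single column. It is an illustration only and not Collibra's implementation: the function names, the rule for what counts as an empty value, the restriction to two data types (Whole Number and Text), and the `max_distinct_ratio` threshold for categorical data are all assumptions made for this example.

```python
from collections import Counter
from statistics import mean, median


def is_empty(value):
    """Treat None and blank strings as empty values (an assumption)."""
    return value is None or str(value).strip() == ""


def detect_data_type(values):
    """Return 'Whole Number' if every non-empty value parses as an integer,
    'Text' otherwise, and 'N/A' if there are no non-empty values."""
    non_empty = [v for v in values if not is_empty(v)]
    if not non_empty:
        return "N/A"
    try:
        for v in non_empty:
            int(str(v).strip())
    except ValueError:
        return "Text"
    return "Whole Number"


def is_categorical(values, max_distinct_ratio=0.01):
    """Flag a column as categorical when it has few distinct values relative
    to its non-empty rows. The threshold is hypothetical."""
    non_empty = [v for v in values if not is_empty(v)]
    if not non_empty:
        return False
    return len(set(non_empty)) / len(non_empty) <= max_distinct_ratio


def profile_column(values):
    """Compute a small subset of the profiling attributes for one column."""
    non_empty = [v for v in values if not is_empty(v)]
    data_type = detect_data_type(values)
    profile = {
        "Row Count": len(values),
        "Empty Values Count": len(values) - len(non_empty),
        "Number of distinct values": len(set(non_empty)),
        "Data Type": data_type,
        "Categorical Data": is_categorical(values),
    }
    if data_type == "Whole Number":
        numbers = [int(str(v).strip()) for v in non_empty]
        profile["Minimum Value"] = min(numbers)
        profile["Maximum Value"] = max(numbers)
        profile["Mean"] = mean(numbers)  # empty rows are excluded
        profile["Median"] = median(numbers)
    elif data_type == "Text":
        lengths = [len(str(v)) for v in non_empty]  # white space counts
        profile["Minimum Text Length"] = min(lengths)
        profile["Maximum Text Length"] = max(lengths)
    if profile["Categorical Data"]:
        counts = Counter(non_empty)
        profile["Categories"] = sorted(counts)
        profile["Mode"] = counts.most_common(1)[0][0]
    return profile
```

For example, `profile_column(["10", "20", "", "20", None, "30"])` treats the column as Whole Number even though the values are stored as text, which mirrors the difference between Data Type and Technical Data Type described in the table. A column of 100 000 rows with only five distinct values is flagged as categorical under the assumed threshold.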

 
| Table attribute | Profiling option | Statistics | Description | Retrieved from JDBC property |
| --- | --- | --- | --- | --- |
| Table Name | No option selected | N/A | The table name in the data source. | TABLE_NAME |
| Table Type | No option selected | N/A | The table type in the data source, such as TABLE or VIEW. | TABLE_TYPE |
| Description from Source | No option selected | N/A | The description of the table in the data source. | REMARKS |
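
The attributes that are retrieved from JDBC properties can be pictured as a simple lookup from attribute name to JDBC metadata column. The following Python sketch illustrates that mapping under stated assumptions: plain dictionaries stand in for the rows that a JDBC driver returns for column and primary key metadata, and the names `COLUMN_ATTRIBUTE_TO_JDBC`, `TABLE_ATTRIBUTE_TO_JDBC`, and `column_attributes` are hypothetical. It does not show how Collibra itself reads or converts these values.

```python
# Attribute name -> JDBC property, as listed in the tables above.
COLUMN_ATTRIBUTE_TO_JDBC = {
    "Column Name": "COLUMN_NAME",
    "Description from Source": "REMARKS",
    "Technical Data Type": "TYPE_NAME",
    "Char octet Length": "CHAR_OCTET_LENGTH",
    "Column Position": "ORDINAL_POSITION",
    "Is Auto Incremented": "IS_AUTOINCREMENT",
    "Is Generated": "IS_GENERATEDCOLUMN",
    "Is Nullable": "IS_NULLABLE",
    "Number Of Fractional Digits": "DECIMAL_DIGITS",
    "Size": "COLUMN_SIZE",
}

TABLE_ATTRIBUTE_TO_JDBC = {
    "Table Name": "TABLE_NAME",
    "Table Type": "TABLE_TYPE",
    "Description from Source": "REMARKS",
}


def column_attributes(column_row, primary_key_rows):
    """Build the JDBC-derived attributes for one column.

    column_row: dict standing in for one row of column metadata.
    primary_key_rows: list of dicts standing in for the primary keys result set.
    """
    attributes = {
        name: column_row.get(prop)
        for name, prop in COLUMN_ATTRIBUTE_TO_JDBC.items()
    }
    # "Is Primary Key" is true if the primary keys result set
    # contains this column's COLUMN_NAME.
    pk_row = next(
        (row for row in primary_key_rows
         if row["COLUMN_NAME"] == column_row["COLUMN_NAME"]),
        None,
    )
    attributes["Is Primary Key"] = pk_row is not None
    attributes["Primary Key Name"] = pk_row["PK_NAME"] if pk_row else None
    return attributes
```

Attributes that depend on a profiling option, such as Row Count or Mean, are deliberately absent from this mapping because they are calculated from the data rather than read from the source metadata.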