Warning  We have announced the end of life of Jobserver and all related Jobserver integrations for September 30, 2024, with the exception of Public Sector customers using GovCloud or on-prem environments. For more information, go to Announcements.

Data type detection via Jobserver

When you run data profiling while registering a data source, Collibra Data Intelligence Platform tries to detect the data type of each column.

  1. Collibra tries to match the fields of each column with every data type.
  2. Collibra records every match for each field, including cases where a field matches more than one data type.
  3. For each data type, Collibra calculates the matching percentage, which is the percentage of fields in the column that match that data type.
  4. Collibra verifies the matching percentage against the data type detection threshold.
    Tip You can define the data type detection threshold in Collibra Console. For more information, see the Collibra Installation and Configuration Guide.
  5. Collibra assigns the data type with the highest matching percentage to the source column, provided that the matching percentage exceeds the threshold.
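
The steps above can be sketched in a few lines of Python. This is only an illustration of the described logic, not Collibra's implementation: the matcher rules, the function name detect_data_type, the 80 percent default threshold and the tie-breaking behavior are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical matchers for a few base data types (assumed rules, not Collibra's).
MATCHERS = {
    "integer": lambda v: re.fullmatch(r"[+-]?\d+", v) is not None,
    "decimal": lambda v: re.fullmatch(r"[+-]?\d+\.\d+", v) is not None,
    "boolean": lambda v: v.lower() in {"true", "false", "0", "1"},
}


def detect_data_type(fields, threshold_percent=80.0):
    """Return the best-matching data type for a column, or None.

    threshold_percent stands in for the data type detection threshold;
    the default of 80 is an assumed value for this sketch.
    """
    if not fields:
        return None

    # Steps 1 and 2: test every field against every data type and keep
    # all matches, so a field such as "1" counts for integer and boolean.
    counts = Counter()
    for value in fields:
        for name, matches in MATCHERS.items():
            if matches(value):
                counts[name] += 1

    if not counts:
        return None

    # Step 3: matching percentage of the data type with the most matches.
    # Ties go to the type that was counted first (an assumption).
    best, hits = max(counts.items(), key=lambda item: item[1])
    percentage = 100.0 * hits / len(fields)

    # Steps 4 and 5: assign the type only if it exceeds the threshold.
    return best if percentage > threshold_percent else None
```

For the column values "1", "0", "1" and "7", every field matches integer while only three of the four match boolean, so the sketch returns integer. A column in which no data type exceeds the threshold returns None.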

Out of the box, there are several base data types such as integer, text and boolean. Every data profiling job evaluates these base data types. If your data source contains special data types such as social security numbers or international bank account numbers, you can define them as advanced data types. In the data source registration wizard, you can then choose to also evaluate the data against these advanced data types.

Keep in mind that detecting advanced data types significantly increases the data profiling job execution time.
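
As a rough illustration of why advanced data types add work, the following Python sketch treats each data type as a pattern and makes the advanced ones opt-in. The patterns, the names BASE_TYPES, ADVANCED_TYPES, active_types and matching_types, and the include_advanced flag are hypothetical. How Collibra actually stores and evaluates advanced data types is not described here.

```python
import re

# Assumed patterns for two base data types.
BASE_TYPES = {
    "integer": re.compile(r"[+-]?\d+"),
    "boolean": re.compile(r"true|false", re.IGNORECASE),
}

# Assumed, simplified patterns for two advanced data types.
ADVANCED_TYPES = {
    # Shape of a US social security number, for illustration only.
    "ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
    # Shape of an IBAN: country code, two check digits, 11 to 30 alphanumerics.
    "iban": re.compile(r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}"),
}


def active_types(include_advanced=False):
    """Return the patterns to evaluate; advanced types are opt-in."""
    types = dict(BASE_TYPES)
    if include_advanced:
        types.update(ADVANCED_TYPES)
    return types


def matching_types(value, include_advanced=False):
    """List every active data type whose pattern matches the whole value."""
    return [
        name
        for name, pattern in active_types(include_advanced).items()
        if pattern.fullmatch(value)
    ]
```

In this sketch, each enabled advanced data type adds one more pattern check for every field in every column, so the number of checks grows with the number of advanced data types that you enable.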