Sample data

Important 

In Collibra 2024.02, we've launched a new user interface (UI) in beta for Collibra Data Intelligence Platform! You can learn more about this latest UI in the UI overview.

What is sample data?

Sample data is a set of randomly collected data from a data source. Depending on your environment, sample data can be displayed for Table, Column, or Data Set assets. The purpose of showing sample data is to provide examples of the data so you know what to expect when you use the asset.

When and where can you see sample data?

You can view sample data for an asset only if the following conditions are met:

If sample data is available, it is displayed in:

Asset type | If Catalog experience is active, you can see the sample data in: | If Catalog experience is not active, you can see the sample data in:
Table | Summary tab pane; Sample data tab pane | Details tab pane; Sample data tab pane
Column | Summary tab pane; Data profiling tab pane | Details tab pane; Sample data tab pane
Data Set | Summary tab pane; Sample data tab pane | Details tab pane; Sample data tab pane

Note Currently, you can request sample data via Edge only for Table and Column assets.

Asset type | Location of sample data
Table | Sample Data tab
Column | Summary tab: Descriptive Statistics section
Data Set | Sample Data tab

Tip 
In Table and Data Set assets, you see sample data only for columns for which you have the required permission. If you do not have access, you see the text <sensitive> in the column instead of sample data.
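
The column-level behavior described in this tip can be pictured with a small sketch. This is only an illustration and not Collibra's implementation: the function name mask_sample_rows, the permitted_columns parameter, and the example rows are all hypothetical. Only the <sensitive> placeholder text comes from the tip above.

```python
# Illustrative sketch only; not Collibra code.
# mask_sample_rows and permitted_columns are hypothetical names.
SENSITIVE_PLACEHOLDER = "<sensitive>"


def mask_sample_rows(rows, permitted_columns):
    """Return new rows where values of non-permitted columns are replaced
    by the placeholder text. The input rows are left unchanged."""
    return [
        {
            column: (value if column in permitted_columns else SENSITIVE_PLACEHOLDER)
            for column, value in row.items()
        }
        for row in rows
    ]


# Made-up sample rows for a Table asset with three columns.
sample_rows = [
    {"customer_id": 101, "email": "a@example.com", "country": "BE"},
    {"customer_id": 102, "email": "b@example.com", "country": "US"},
]

# Assume the current user has the required permission for two of the columns.
masked_rows = mask_sample_rows(sample_rows, permitted_columns={"customer_id", "country"})

for row in masked_rows:
    print(row)
```

In this sketch the email column shows the placeholder for every row, while the other columns keep their sample values.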

How does Collibra handle sample data?

The way Collibra handles sample data depends on how the assets are added in Collibra and how the sample data is collected:

 

Sample data for an asset is uploaded via the Catalog REST API - Profiling.

  • Assets are created by registering a data source via Edge: The sample data is stored in the Collibra cloud repository. The sample data is displayed to all users with the required permissions.
  • Assets are created by registering a data source via Jobserver: The sample data is stored in the Collibra cloud repository. This sample data is also used for data classification via the Data Classification Platform. The sample data is displayed to all users with the required permissions.
  • Assets are manually added or imported: The sample data is stored in the Collibra cloud repository. The sample data is displayed to all users with the required permissions.

Sample data is collected and stored when the data source is registered via Jobserver. See Configure the use of sample data via Jobserver.

  • Assets are created by registering a data source via Edge: Not applicable.
  • Assets are created by registering a data source via Jobserver: The sample data is stored in the Collibra cloud repository. This sample data is also used for data classification via the Data Classification Platform. The sample data is displayed to all users with the required permissions.
  • Assets are manually added or imported: Not applicable.

Sample data can be manually requested for an asset that is registered via the Edge register data source process.

  • Assets are created by registering a data source via Edge: The randomly collected sample data is cached on the Edge. No sample data is stored in the Collibra cloud repository. The sample data is displayed only to users with the required permissions, and only if the sample data has been requested. Sample data is not anonymized because access to the data examples via Edge is based on permissions and the sample data is not stored in Collibra.
  • Assets are created by registering a data source via Jobserver: Not applicable.
  • Assets are manually added or imported: Not applicable.

Note 
  • Currently, you can request sample data via Edge only for Table and Column assets.
  • We randomly collect rows from the data source. The data within each randomly collected row is not switched around; we display all data for each randomly collected row.


For details on the process, go to Understanding the process to display sample data.
For details on the sample data limitations and guidelines, go to Limitations and guidelines.
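
The note about sample data requested via Edge says that rows are collected randomly and that the data within each collected row is not switched around. A minimal sketch of that idea follows. It is an assumption-laden illustration, not the Edge implementation: the function name collect_sample_rows, its parameters, and the example rows are hypothetical.

```python
# Illustrative sketch only; not the Collibra Edge implementation.
# collect_sample_rows and its parameters are hypothetical names.
import random


def collect_sample_rows(rows, sample_size, seed=None):
    """Pick whole rows at random. Values are never mixed across rows,
    so each returned row is exactly one of the source rows."""
    rng = random.Random(seed)
    count = min(sample_size, len(rows))
    return rng.sample(rows, count)


# Made-up source rows: (name, city, age).
source_rows = [
    ("Alice", "Brussels", 34),
    ("Bob", "New York", 51),
    ("Chen", "Singapore", 28),
    ("Dana", "Wroclaw", 45),
]

picked_rows = collect_sample_rows(source_rows, sample_size=2, seed=7)

for row in picked_rows:
    print(row)
```

Because whole rows are picked, every value in a displayed row belongs to the same source record. This is one reason the page stresses that access to this sample data is governed by permissions.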