Troubleshooting sample data

Tip Make sure your Edge site meets the Sample data requirements. For information, go to Edge hardware requirements to show sample data.

1 Message: To ensure data security, sample data is currently not visible

Issue:
When you open sample data for an asset, no data is displayed and the page shows the following message: To ensure data security, sample data is currently not visible.

Possible reasons:

  • You don't have the required permissions to view sample data.
  • The Catalog JDBC Sampling capability has not been defined for the data source Edge connection.

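Solution:

  • Verify that you have both the View permission and the View Samples permission for the asset.
  • Add the Catalog JDBC Sampling capability for the data source Edge connection.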

2 Message codes

Each code is listed with its description, possible reasons, and solution.

200
Description: This code indicates that the sample data processes ran correctly. It is also returned when no sample data is available in the data source.

400
Description: This message appears if something is wrong with the provided asset ID or if the sampling capability is not installed on the Edge site. The error message specifies the problem.
Possible reasons:
  • The asset exists but is not a column or table.
  • The table has no columns.
  • Something is wrong in the relationship of the column, table or database, for example a column asset that was created manually instead of ingested and for which no relationship has been defined.
  • The Catalog JDBC Sampling capability has not been defined for the data source Edge connection.
Solution:
  • If the asset is wrong, provide a valid column or table asset ID.
  • If the sampling capability is missing, add the Catalog JDBC Sampling capability for the data source.

401
Description: This message appears if you are not authenticated to use the sampling API.
Possible reasons: The authentication failed.
Solution: Provide valid credentials.

403
Description: This message appears if you lack permission for any of the columns within the requested asset.
Possible reasons: You do not have the required permissions. Both the View permission and the View Samples permission are needed to see sample data for an asset.
Solution: Verify that the user has the required permissions.

404
Description: This message appears if the asset cannot be found.
Possible reasons: The asset does not exist.
Solution: Provide an existing column or table asset ID.

503
Description: This message appears if the Edge service times out or fails.
Possible reasons: The Edge service is not available.
Solution: Verify that the Edge site is still online and healthy. If it is not, check the Edge logs to get a better understanding of the issue. If the problem persists, contact Collibra Support for assistance.
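
If you call the sampling API from a script, the message codes in this section can be turned into a small lookup. The following is a minimal sketch, not part of any Collibra client library: the names SAMPLING_CODE_HINTS and explain_sampling_code are hypothetical, and the hint texts only paraphrase this section.

```python
# Hypothetical lookup table: the hints paraphrase the message codes in this section.
SAMPLING_CODE_HINTS = {
    200: "Sampling ran correctly. An empty result means the data source has no sample data.",
    400: "Provide a valid column or table asset ID, or add the Catalog JDBC Sampling capability.",
    401: "Authentication failed. Provide valid credentials.",
    403: "Verify that the user has both the View and View Samples permissions.",
    404: "The asset does not exist. Provide an existing column or table asset ID.",
    503: "The Edge service is not available. Verify that the Edge site is online and healthy.",
}


def explain_sampling_code(status_code: int) -> str:
    """Return a troubleshooting hint for a sampling message code."""
    return SAMPLING_CODE_HINTS.get(
        status_code, "This code is not covered in this troubleshooting section."
    )
```

For example, explain_sampling_code(403) returns the hint about the View and View Samples permissions. Codes that this section does not describe fall through to a generic message.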

3 Error message: Generic API exception PayloadTooLarge

Issue:
When you open sample data for a Table asset, you receive the following message: com.collibra.edge.management.exceptions.PayloadTooLarge: The payload size should be below 102400 bytes.

Reason: The table contains too many columns.

Solution: You can open an individual Column asset to request its sample data.

4 Error message: There is no matching sampling capability found

Issue:
You receive the following error message:
There is no matching sampling capability found for connection [connection_id].

Reason:
This message appears when you open a Column or Table asset page for a data source that is registered via Edge, but the Edge site has no sampling capability associated with that data source.

Solution:
To solve the issue, add the Catalog JDBC Sampling capability for the data source. The message provides the ID of the Edge connection linked to the data source.
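
Because the message contains the Edge connection ID, a script that collects these errors can extract the ID to find the connection that needs the capability. The following is a minimal sketch under assumptions: the function name extract_connection_id is hypothetical, and the pattern assumes that the ID follows the words "for connection", may or may not be wrapped in square brackets, and contains no spaces or periods.

```python
import re
from typing import Optional

# Assumption: the ID directly follows "for connection" and may be wrapped in brackets.
_CONNECTION_PATTERN = re.compile(
    r"no matching sampling capability found for connection \[?([^\]\s.]+)\]?",
    re.IGNORECASE,
)


def extract_connection_id(message: str) -> Optional[str]:
    """Return the Edge connection ID from the error message, or None if absent."""
    match = _CONNECTION_PATTERN.search(message)
    return match.group(1) if match else None
```

If the message does not match the assumed wording, the function returns None instead of guessing.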

5 No sample data is displayed

There are many conditions that can result in no sample data being displayed. Before reporting an issue, check the following:

Reason: The setting Maximum number of samples is set to 0.
Description: The sampling feature is disabled and no samples are displayed.
Solution: Set the Data Profiling setting Maximum number of samples to a value higher than 0. For details, go to Configuring the use of sample data.

Reason: The sampling capability is missing for your Edge data source.
Description: Samples can only be extracted if the sampling capability is set for the data source on the corresponding Edge site.
Solution: Add the Catalog JDBC Sampling capability for the data source.

Reason: The asset for which you want to collect sample data has no data.
Description: There is no data to show for the asset.

Reason: No sample data is stored in the Collibra cloud repository (not applicable for data sources registered via Edge).
Description:
  • For Jobserver data sources, sample data is only available in the Collibra cloud repository if the Store Sample Data option was selected during the registration of the data source.
  • For assets created without Jobserver or Edge registration, sample data is only available if it was uploaded to the Collibra cloud repository via the Catalog Profiling REST API.
Solution: Go to Configuring the use of sample data.

6 You do not see the Request Sample Data button

Issue:
You want to see sample data for an asset but you cannot request it. The Request Sample Data button is not available.

Reason:
The possible reasons are:

7 I cannot request sample data for a data set via Edge

Currently, you can request sample data via Edge only for Table and Column assets.

8 You always get old sample data for a data source registered via Edge

Issue:
You always see old sample data for a data source registered via Edge.

Reason:
Sample data stored in the Collibra cloud repository takes precedence over sample data extraction by Edge. The Collibra cloud repository can contain sample data for an Edge data source if this data source was previously connected to Jobserver or if sample data was pushed for the data source using the Catalog Profiling REST API. For more information on the process, go to Understanding the process to display sample data.

Solution:
If you want to remove samples from the Collibra cloud repository, go to Delete sample data.
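
The precedence rule in the Reason above can be pictured as a simple fallback. The following is an illustrative sketch only, not Collibra's implementation: the function resolve_sample_data and its parameters are hypothetical, and "no stored sample data" is modeled as None.

```python
from typing import Callable, List, Optional


def resolve_sample_data(
    stored_samples: Optional[List[str]],
    extract_via_edge: Callable[[], List[str]],
) -> List[str]:
    """Pick the sample data to display, following the precedence rule."""
    # Samples stored in the cloud repository win, even if they are old.
    if stored_samples is not None:
        return stored_samples
    # Edge extraction is only used when nothing is stored.
    return extract_via_edge()
```

In this model, Edge extraction is never reached while stored samples exist, which matches the behavior of always seeing old sample data until the stored samples are deleted.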

9 Collecting the sample data is very slow via Edge

  • It can take some time to read and display the sample data available in the Edge cache.
  • The sample data extraction time via Edge is influenced by multiple factors. For example: table size, number of columns in a table, number of samples to collect, maximum length of samples, and push-down sampling mechanism available for the data source. For more details, go to Sample data limitations and guidelines.
  • Many sample data requests may be running in parallel. This happens when many users want to see sample data at the same time.

    Tip If you experience issues in this situation, you can decrease the number of Edge data sources for which the sampling capability is enabled.

10 You want to retrieve sample data log files

For data sources registered via Edge, Edge logs are generated when sample data is extracted from the data source and cached on the Edge site. The logs start with this text: "Writing cache samples with the key...".
Looking at the Edge logs within a 2-day period should give information on the sampling activity.
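
If you have exported the Edge logs to text, you can filter them on the text mentioned above. The following is a minimal sketch under assumptions: the function name find_cache_keys is hypothetical, the cache key is assumed to be a single-quoted value starting with catalog.sample., and the key is accepted either on the same line as the marker text or on a later line.

```python
import re
from typing import Iterable, List

MARKER = "Writing cache samples with the key"
# Assumption: keys are single-quoted and start with "catalog.sample.".
KEY_PATTERN = re.compile(r"'(catalog\.sample\.[^']+)'")


def find_cache_keys(lines: Iterable[str]) -> List[str]:
    """Collect the cache keys that follow the sampling marker text."""
    keys: List[str] = []
    waiting_for_key = False
    for line in lines:
        if MARKER in line:
            waiting_for_key = True
        if waiting_for_key:
            match = KEY_PATTERN.search(line)
            if match:
                keys.append(match.group(1))
                waiting_for_key = False
    return keys
```

Passing the lines of a log file to find_cache_keys returns one key per sampling entry, which gives a quick count of the sampling activity in the period you exported.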

Example 

Writing cache samples with the key
'catalog.sample.6385e23cb1ae443a7786c555108d8bb028d23dee39e76ce3169eaa9cdacb1ed3'
"Cache write sample for table 'Snowflake>SNOWFLAKE_SAMPLE_DATA>TPCDS_SF100TCL>CALL_CENTER'