Troubleshooting: sample data
1 You receive an error code
| Code | Description | Possible causes | Solution |
|---|---|---|---|
| 400 | This message appears if: | | |
| 401 | This message appears if you are not authenticated to use the sampling API. | The authentication failed. | Provide valid credentials. |
| 403 | This message appears if you lack permission to any of the columns within the requested asset. | You do not have the required permissions. Both View permission and View Samples permission are needed to see sample data for an asset. | Verify the user has the required permissions. |
| 404 | This message appears if the asset cannot be found. | The asset does not exist. | Provide an existing column or table asset id. |
| 503 | This message appears if the Edge service gets a timeout or fails. | The Edge service is not available. | Verify that the Edge site is still online and healthy. If not, check the Edge logs to get a better understanding of the issue. If the problem persists, contact Collibra Support for assistance. |
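If you call the sampling API from a script, the table above can be turned into a small lookup so that failures come with a hint. The following is only a sketch: the names `SAMPLING_ERROR_HINTS` and `explain_sampling_error` are hypothetical and not part of any Collibra API, and code 400 is left out because its causes are not listed above.

```python
# Hypothetical helper: maps the HTTP status codes from the table above
# to their documented cause and solution. Not part of any Collibra API.
SAMPLING_ERROR_HINTS = {
    401: ("The authentication failed.", "Provide valid credentials."),
    403: (
        "You do not have the required permissions.",
        "Verify the user has both View and View Samples permissions.",
    ),
    404: (
        "The asset does not exist.",
        "Provide an existing column or table asset id.",
    ),
    503: (
        "The Edge service is not available.",
        "Verify that the Edge site is online and healthy, then check the Edge logs.",
    ),
}


def explain_sampling_error(status_code: int) -> str:
    """Return a one-line hint for a status code returned by the sampling API."""
    hint = SAMPLING_ERROR_HINTS.get(status_code)
    if hint is None:
        # Codes without a documented cause fall through to a generic message.
        return f"{status_code}: no troubleshooting entry, consult the table above."
    cause, solution = hint
    return f"{status_code}: {cause} {solution}"
```

For example, `explain_sampling_error(401)` returns the cause and solution from the 401 row as a single string.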
2 You receive error message: There is no matching sampling capability found
Issue: You receive the following error message: There is no matching sampling capability found for connection [connection_id].
Reason: This message appears when you open a Column or Table asset page for a data source that has been registered via Edge but for which the Edge site doesn't have an associated Edge capability for sampling.
Solution: To solve the issue, install the Catalog JDBC Sampling capability for the data source. The message provides the id of the Edge connection linked to the data source.
3 No sample data is displayed
There are many conditions that can result in no sample data being displayed. Before reporting an issue, check the following:
| Cause | Description | Solution |
|---|---|---|
| The setting Maximum number of samples is set to 0. | The sampling feature is disabled and no samples are displayed. | Set the Data Profiling setting Maximum number of samples to a value higher than 0. See Configuring the use of sample data. |
| The sampling capability is missing for your Edge data source. | Samples can only be extracted if the sampling capability is set for the data source on the corresponding Edge site. | Install the Catalog JDBC Sampling capability for the data source. |
| The asset for which you want to collect sample data has no data. | There is no data to show for the asset. | |
| No sample data is stored in the Collibra cloud repository. | | See Configuring the use of sample data. |
4 You always see old sample data for a data source registered via Edge
Sample data stored in the Collibra cloud repository takes precedence over sample data extraction by Edge. Sample data can be available for an Edge data source in the Collibra cloud repository if this data source was previously connected to Jobserver or if sample data was pushed using the Catalog profiling REST API for the data source.
If you want to remove samples from the Collibra cloud repository, see Delete sample data.
See also Understanding the process to display sample data.
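The precedence rule described in this section can be pictured with a short sketch. This is an illustration only, not Collibra's implementation: `resolve_samples` and its parameters are hypothetical names, and it assumes that "available" means a non-empty set of stored samples.

```python
from typing import Callable, List, Optional


def resolve_samples(
    cloud_samples: Optional[List[dict]],
    extract_via_edge: Callable[[], List[dict]],
) -> List[dict]:
    """Illustrate the precedence rule: stored samples win over Edge extraction.

    cloud_samples: samples already stored in the cloud repository, if any
        (assumption: None or an empty list means nothing is stored).
    extract_via_edge: callable that performs a fresh extraction through Edge.
    """
    if cloud_samples:
        # Stored samples take precedence, so Edge is never asked for fresh data.
        return cloud_samples
    return extract_via_edge()
```

Under this reading, old samples keep appearing until the stored ones are deleted, which is why removing them from the Collibra cloud repository resolves the issue.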
5 Collecting the sample data is very slow
- It can take some time to read and display the sample data available in the Edge cache.
- The sample data extraction time via Edge is influenced by multiple factors, for example: table size, number of columns in a table, number of samples to collect, maximum length of samples, and the push-down sampling mechanism available for the data source. For more details, go to Sample data beta feature limitations and guidelines.
6 Retrieving sample data log files
For data sources registered via Edge, Edge logs are generated when sample data is extracted from the data source and cached on the Edge site. The logs start with this text: "Writing cache samples with the key...".
Looking at the Edge logs within a 2-day period should give information on the sampling activity.
Writing cache samples with the key 'catalog.sample.6385e23cb1ae443a7786c555108d8bb028d23dee39e76ce3169eaa9cdacb1ed3'
Cache write sample for table 'Snowflake>SNOWFLAKE_SAMPLE_DATA>TPCDS_SF100TCL>CALL_CENTER'
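To locate these entries in a downloaded Edge log file, you can filter on the opening text quoted above. The sketch below assumes the log is plain text with one entry per line and that the cache key is wrapped in single quotes, as in the example. The function name `extract_cache_keys` is hypothetical.

```python
import re
from typing import Iterable, List

# Matches the log text quoted above and captures the key between single quotes.
KEY_PATTERN = re.compile(r"Writing cache samples with the key '([^']+)'")


def extract_cache_keys(lines: Iterable[str]) -> List[str]:
    """Return the cache keys from every 'Writing cache samples' log line."""
    keys = []
    for line in lines:
        match = KEY_PATTERN.search(line)
        if match:
            keys.append(match.group(1))
    return keys
```

A file can be passed directly, for example `extract_cache_keys(open("edge.log", encoding="utf-8"))`, where `edge.log` is a placeholder file name.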