Profiling via Edge
When you register a data source via Edge using an Edge site that also has a JDBC profiling capability, you can start the profiling and classification process via the Configuration tab page on the Database asset page. The Edge site then initiates the profiling and classification process and sends the results to Collibra Data Intelligence Cloud.
Note: Collibra Data Intelligence Cloud only has access to synchronized metadata, anonymized profiling results, and classification suggestions. It does not have access to the actual data in your data source.
Limitations
Currently, profiling via Edge has a few limitations:
- Advanced data types are not supported.
- Only a limited number of data sources are supported.
- You have to register the data source via Edge and synchronize one or more schemas.
Profiling options
Before you create a data profile of registered metadata, you have to indicate whether you want to profile everything or only a sample. You can also enable an option to automatically profile and classify synchronized metadata.
| Option | Description |
|---|---|
| Automatically run when a metadata extraction is synchronized | Enable to automatically create a data profile and classify columns every time the synchronization process of one or more schemas finishes. |
| Full scan | Select to profile and classify based on all synchronized metadata. |
| Partial scan | Select to profile and classify based on a sample of the synchronized metadata. When you select Partial scan, you can enter the maximum number of rows that you want to use for profiling and classification. By default, the maximum number of rows is 20000. Tip: Edge uses push-down sampling to create a random sample of the metadata. This option is only available for data sources that support push-down sampling. |
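The difference between a full scan and a partial scan with push-down sampling can be illustrated with a minimal Python sketch. It is not how Edge is implemented: the `ProfilingOptions` class, the `build_query` and `fetch_rows` functions, and the use of an in-memory SQLite database are all assumptions made for illustration. The point it shows is that, with push-down sampling, the random sample and the row limit are part of the query, so the data source itself returns at most the configured number of rows.

```python
import sqlite3
from dataclasses import dataclass

# Default row limit for a partial scan, as stated in the options table.
DEFAULT_MAX_ROWS = 20000


@dataclass
class ProfilingOptions:
    """Hypothetical container mirroring the options in the table (not a Collibra API)."""
    auto_run_after_sync: bool = False
    full_scan: bool = False
    max_rows: int = DEFAULT_MAX_ROWS  # only used when full_scan is False


def build_query(table: str, options: ProfilingOptions) -> str:
    """Return the SQL that selects the rows to profile.

    The table name is assumed to come from trusted, synchronized metadata.
    For a partial scan, sampling is pushed down into the query, so the
    database returns at most max_rows randomly chosen rows.
    """
    if options.full_scan:
        return f'SELECT * FROM "{table}"'
    # SQLite syntax; other databases use different sampling clauses.
    return f'SELECT * FROM "{table}" ORDER BY RANDOM() LIMIT {int(options.max_rows)}'


def fetch_rows(conn: sqlite3.Connection, table: str, options: ProfilingOptions) -> list:
    """Run the query for the chosen scan type and return the selected rows."""
    return conn.execute(build_query(table, options)).fetchall()


# Illustrative data source: an in-memory table with 100 rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT)")
conn.executemany(
    "INSERT INTO customers (country) VALUES (?)",
    [("BE",) if i % 2 else ("US",) for i in range(100)],
)

partial = fetch_rows(conn, "customers", ProfilingOptions(max_rows=25))
full = fetch_rows(conn, "customers", ProfilingOptions(full_scan=True))
print(len(partial), len(full))  # prints "25 100"
```

In this sketch, a data source that cannot express the sampling clause would have no equivalent of the partial-scan query, which matches the restriction that the option is only available for data sources that support push-down sampling.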
Settings
Before you profile via Edge, review the following settings.
| Section | Setting | Description |
|---|---|---|
| Register a data source | Database registration via Edge | An option to enable database registration via Edge. Note: Enabling data source registration via Edge does not prevent you from registering a data source via Jobserver as well. |
| Data profiling | | This setting is no longer relevant. All profiled data is automatically anonymized. |
| Data profiling | Database profiling via Edge | An option to enable profiling and classifying synchronized metadata via Edge instead of Jobserver. Note: You can only enable Database profiling via Edge if you also enabled Database registration via Edge. |
| Cloud Classification configuration | Enable data classification | If the Enable data classification option is set to |