Remote file connections
This section gives an overview of the supported file formats and the limitations when connecting to a remote file.
Supported file types
Because file formats differ in structure, you may need to prepare your data before establishing a connection.
Supported delimiters
The following table lists the delimiters available in the Delimiter dropdown menu.
Type | Format | Description |
---|---|---|
Comma | CSV | , is used to separate values in the file. |
Tab | TSV | tab is used to separate values in the file. |
Semicolon | CSV | ; is used to separate values in the file. |
Double Quote | CSV | " is used to separate values in the file. |
Single Quote | CSV | ' is used to separate values in the file. |
Pipe | TXT | \| is used to separate values in the text file. |
SOH | TXT | The Unicode character 'START OF HEADING' (U+0001), an invisible control character, is used to separate values in the text file. |
Custom | N/A | Add a custom delimiter. Support for custom delimiters may vary. |
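To check which delimiter a file actually uses before you connect to it, a small script can help. The following Python sketch is illustrative only and is not part of Collibra DQ: the `DELIMITERS` mapping and the `split_rows` helper are hypothetical names that mirror the table above. The sketch splits each line on the chosen character and does not handle quoting or escaping.

```python
# Hypothetical mapping of the delimiter names in the table above
# to the character each one represents.
DELIMITERS = {
    "Comma": ",",
    "Tab": "\t",
    "Semicolon": ";",
    "Double Quote": '"',
    "Single Quote": "'",
    "Pipe": "|",
    "SOH": "\x01",  # START OF HEADING (U+0001), an invisible control character
}


def split_rows(text, delimiter_name):
    """Split delimited text into rows of fields using the named delimiter.

    This is a naive split: it ignores quoting and escaping rules,
    so use it only for a quick inspection of a sample file.
    """
    delimiter = DELIMITERS[delimiter_name]
    return [line.split(delimiter) for line in text.splitlines() if line]


if __name__ == "__main__":
    # A two-line sample that uses SOH as the separator.
    sample = "id\x01name\n1\x01Ada\n"
    for row in split_rows(sample, "SOH"):
        print(row)
```

Because SOH is invisible in most editors, printing the parsed rows this way is a quick way to confirm that a file is SOH-delimited rather than a single-column file.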
Known limitations
- DQ jobs that run on remote file connections with headers that contain white spaces fail with a requirement failed exception message. A possible workaround is to edit the DQ Job command line in the Run CMD tab and place single quotes ('') around the column name in -q and double quotes ("") around the contents of the -header flag.
- Filtergram on the Data Preview tab is not available for any remote file connection. Currently, there is no workaround for this limitation.
- Array and nested array datatypes in JSON files are not supported.
- While Collibra DQ supports most UTF-8 encoded characters in column headers of file-based connections, some Chinese characters are not currently supported. Jobs that run on files whose headers contain these unsupported characters fail with a mismatched input exception message.
- When you use Validate Source, the Update Source Scope button is not available for remote files. Update Source Scope is only visible for JDBC connections.
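Several of the limitations above can be spotted in a file before you run a job. The following Python sketch is a hypothetical pre-flight check and is not a Collibra DQ feature; all function names are illustrative. It flags headers that contain white space, headers that contain non-ASCII characters (a heuristic for manual review only, because the exact set of unsupported characters is not listed here), and JSON values that contain arrays.

```python
import json


def headers_with_whitespace(headers):
    """Return the headers that contain at least one white space character."""
    return [h for h in headers if any(ch.isspace() for ch in h)]


def non_ascii_headers(headers):
    """Return the headers that contain non-ASCII characters.

    Assumption: this is only a coarse filter for manual review.
    Most UTF-8 characters are supported, so a flagged header may still work.
    """
    return [h for h in headers if not h.isascii()]


def contains_array(value):
    """Return True if a parsed JSON value contains an array at any depth."""
    if isinstance(value, list):
        return True
    if isinstance(value, dict):
        return any(contains_array(v) for v in value.values())
    return False


if __name__ == "__main__":
    headers = ["id", "first name", "名称"]
    print(headers_with_whitespace(headers))
    print(non_ascii_headers(headers))
    print(contains_array(json.loads('{"user": {"tags": ["a", "b"]}}')))
```

A header flagged by the first check is a candidate for the quoting workaround described in the first limitation, while a file flagged by the last check would need its arrays flattened or removed before use.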