Configuring Apache Hadoop to execute DQ Jobs

For large-scale processing and high concurrency, a single vertically scaled Spark server may not be sufficient. To scale beyond one machine, Data Quality & Observability Classic can push compute jobs to an external Hadoop cluster. This section describes how to configure the DQ Agent to submit DQ jobs to Hadoop.
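Conceptually, pushing a job to Hadoop means the agent submits the Spark job to the cluster's YARN resource manager instead of running it on the local Spark server. The sketch below uses standard `spark-submit` options to illustrate the idea; the class name, jar path, job arguments, and resource sizes are hypothetical placeholders, not actual product values.

```shell
# Point the Spark client at the Hadoop cluster's configuration files
# (core-site.xml, yarn-site.xml) so it can locate the resource manager.
export HADOOP_CONF_DIR=/etc/hadoop/conf

# --master yarn and --deploy-mode cluster are standard spark-submit options
# that run the driver and executors inside the Hadoop cluster.
# The class, jar, and trailing job arguments are illustrative placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 4 \
  --executor-memory 4g \
  --class com.example.dq.DQCheckJob \
  /opt/dq/jobs/dq-job.jar \
  -ds my_dataset -rd 2024-01-01
```

Because the work runs on YARN, adding concurrent capacity becomes a matter of adding cluster nodes rather than growing a single Spark server.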

The following diagram shows the Data Quality & Observability Classic architecture with a Hadoop cluster: