Installing Collibra Data Quality & Observability on Spark Standalone Self-hosted

Collibra DQ can be installed and operated on a single standalone host, which is useful when large-scale, high-concurrency checks are not required. In this mode, DQ uses a Spark Standalone pseudo-cluster in which the master and workers run on, and draw resources from, the same server.

The DQ Standalone application consists of the following components:

  • DQ Web
  • DQ Agent
  • DQ Metastore (PostgreSQL database)
  • Spark (pullup only)

Fig 1: Architecture overview of Full Standalone Installation mode

Collibra DQ can either include the PostgreSQL metastore in the installation or connect to an external PostgreSQL metastore. The external metastore is the recommended option.
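Because all components share one host in this mode, a quick reachability probe can help confirm that each network-facing service is up. The following is a minimal sketch, not part of the Collibra DQ tooling. The port numbers are assumptions: 7077 and 5432 are the upstream defaults for a Spark standalone master and PostgreSQL, and 9000 for DQ Web is a placeholder. Replace them with the values used in your installation. The helper names `is_listening` and `check_components` are hypothetical.

```python
import socket
from typing import Dict

# Assumed ports (not taken from this page): 7077 and 5432 are the upstream
# defaults for a Spark standalone master and PostgreSQL; 9000 for DQ Web is
# a placeholder. Adjust all three to match your installation.
ASSUMED_PORTS: Dict[str, int] = {
    "DQ Web": 9000,
    "DQ Metastore (PostgreSQL)": 5432,
    "Spark master": 7077,
}


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


def check_components(
    host: str = "localhost", ports: Dict[str, int] = ASSUMED_PORTS
) -> Dict[str, bool]:
    """Probe each named component and report whether its port accepts connections."""
    return {name: is_listening(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, up in check_components().items():
        print(f"{name}: {'reachable' if up else 'not reachable'}")
```

When an external metastore is used, pass its hostname to `is_listening` separately instead of probing `localhost`. The DQ Agent is left out of the probe because this sketch makes no assumption about whether it exposes a listening port.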

What's next?

Before You Install Standalone