Before You Install Standalone
Prerequisites
- Review the Standalone architecture described in Installing Collibra Data Quality & Observability on Spark Standalone Self-hosted.
- Verify that your system meets the System Requirements.
Download and extract the DQ package
Download the full package tarball from the Collibra Product Resource Center and place it in the directory on the CDQ VM where you want to extract it for the installation.
Extract the file:
tar -xvf dq-full-package.tar.gz
(Optional) Clean up:
rm dq-full-package.tar.gz
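Before extracting, it can help to confirm that the download is intact. The following is a minimal sketch, not part of the official procedure: `check_archive` is a hypothetical helper name, and it assumes the package is a gzip-compressed tar archive, as the `.tar.gz` extension suggests.

```shell
# Hypothetical helper: succeed only if the file is a readable gzip tar archive.
check_archive() {
    archive="$1"
    if [ ! -f "$archive" ]; then
        echo "missing: $archive" >&2
        return 1
    fi
    # -t lists the contents without extracting; -z forces gzip decompression.
    if tar -tzf "$archive" > /dev/null 2>&1; then
        echo "ok: $archive"
    else
        echo "corrupt or not a gzip tar archive: $archive" >&2
        return 1
    fi
}
```

One possible use is `check_archive dq-full-package.tar.gz && tar -xvf dq-full-package.tar.gz`, so that extraction runs only when the archive can be read end to end.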
Set the environment variables
Set the following environment variables:
- OWL_BASE: The directory in which to install the DQ files
- OWL_METASTORE_USER: The default username for the PostgreSQL server
- OWL_METASTORE_PASS: The default password for the PostgreSQL server. It must adhere to the following password policy:
  - Minimum length of 8 characters
  - Maximum length of 72 characters
  - At least one upper-case letter
  - At least one numeric character
  - At least one special character (supported are !, %, &, @, #, $, ^, *, ?, _, ~)
  - Password cannot contain the PostgreSQL username
- SPARK_PACKAGE: The file name of the Spark package to install
- DQ_ADMIN_USER_PASSWORD: The password for the DQ admin user
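The password policy for OWL_METASTORE_PASS can be checked locally before you export the variable. The sketch below is illustrative only: `check_metastore_pass` is a hypothetical helper, the special-character set is read from the policy list as including `*`, and the username comparison is assumed to be case-sensitive. It only checks that at least one supported special character is present; it does not reject other characters.

```shell
# Hypothetical helper: check_metastore_pass PASSWORD USERNAME
# Prints each violated rule to stderr and returns non-zero if any rule fails.
check_metastore_pass() {
    pass="$1"
    user="$2"
    # "!" is kept last so the bracket expression is not read as a negation.
    specials='[%&@#$^*?_~!]'
    failed=0

    [ "${#pass}" -ge 8 ]  || { echo "too short (minimum 8 characters)" >&2; failed=1; }
    [ "${#pass}" -le 72 ] || { echo "too long (maximum 72 characters)" >&2; failed=1; }

    case "$pass" in *[[:upper:]]*) ;; *) echo "no upper-case letter" >&2; failed=1 ;; esac
    case "$pass" in *[[:digit:]]*) ;; *) echo "no numeric character" >&2; failed=1 ;; esac
    case "$pass" in *$specials*) ;; *) echo "no supported special character" >&2; failed=1 ;; esac

    # Assumption: the username comparison is case-sensitive.
    if [ -n "$user" ]; then
        case "$pass" in *"$user"*) echo "contains the username" >&2; failed=1 ;; esac
    fi

    return "$failed"
}
```

For instance, `check_metastore_pass 'H55Mt5EbXh1a%$aiX6' postgres` returns success, while a password with no upper-case letter or one that embeds `postgres` prints the violated rule and returns a non-zero status.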
For example, execute the following commands to set the environment variables:
export OWL_BASE=$(pwd)
export OWL_METASTORE_USER=postgres
export OWL_METASTORE_PASS=H55Mt5EbXh1a%\$aiX6
export SPARK_PACKAGE=spark-3.5.3-bin-hadoop3.2.tgz
export DQ_ADMIN_USER_PASSWORD=<password>
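After exporting the variables, a quick check that all five are present in the environment can catch a typo before the installer runs. This is a minimal sketch under stated assumptions: `check_dq_env` is a hypothetical helper, and it relies on the `printenv` utility being available on the VM.

```shell
# Hypothetical helper: report any required variable that is unset or empty
# in the environment that child processes will inherit.
check_dq_env() {
    missing=0
    for name in OWL_BASE OWL_METASTORE_USER OWL_METASTORE_PASS \
                SPARK_PACKAGE DQ_ADMIN_USER_PASSWORD; do
        # printenv prints only exported variables, so an assignment that
        # was never exported is reported as missing as well.
        if [ -z "$(printenv "$name")" ]; then
            echo "not set: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_dq_env` in the same shell session prints one `not set:` line per missing variable and returns a non-zero status if any variable is missing.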