Upgrade Spark
Important You must upgrade to Java 17 and Spark 3.5.6 to install and use Data Quality & Observability Classic versions 2025.08 or newer. For more information about version compatibility, see the Java and Spark compatibility matrix.
This section details how to upgrade on-premises Apache Spark versions.
Steps
Note The following steps reflect examples of what works in a simple Standalone environment. You may need to modify them to accommodate your specific deployment.
- Run the following commands to stop the Spark and Collibra DQ services.
  - Set the OWL_HOME variable, without a trailing slash after the value.

    ```shell
    export OWL_HOME=<the owl folder where CDQ is installed>
    ```

    Example:

    ```shell
    export OWL_HOME=/home/ec2-user/owl
    ```
  - Stop Spark Master.

    ```shell
    cd $OWL_HOME/spark/sbin
    ./stop-master.sh
    ```
  - Stop Spark Worker.

    ```shell
    cd $OWL_HOME/spark/sbin
    ./stop-worker.sh
    ```
  - Stop DQ Web.

    ```shell
    cd $OWL_HOME/bin
    ./owlmanage.sh stop=owlweb
    ```
  - Stop DQ Agent.

    ```shell
    cd $OWL_HOME/bin
    ./owlmanage.sh stop=owlagent
    ```
  - Verify that all processes are stopped.

    ```shell
    ps -ef | grep owl
    ## No DQ processes should return as running. ##
    ```

    Tip If any DQ processes are still running, you can stop them manually with `kill -9 <pid from the ps command above>`.
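As a small refinement of the verification step, bracketing the first letter of the search pattern keeps the grep command itself out of the results; a sketch (the `owl` pattern matches the DQ process names used throughout this guide):

```shell
# List any remaining DQ processes. Bracketing the first letter ([o]wl) stops
# the grep process itself from matching its own command line.
remaining=$(ps -ef | grep '[o]wl' | awk '{print $2}')
msg="Remaining owl PIDs: ${remaining:-none}"
echo "$msg"
```

If the output reports `none`, every DQ process has stopped and no manual `kill` is needed.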
- Create a package installation folder one level up from OWL_HOME and set the environment variables accordingly.
- Create a backup folder.
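The two folders can be created together. A sketch with illustrative values only: the guide does not prescribe a name for the package installation folder, so `cdq-install` below is hypothetical, and OWL_HOME is assumed to be `$HOME/owl` (analogous to the `/home/ec2-user/owl` example):

```shell
# Illustrative paths only: OWL_HOME as in the earlier example, and a
# hypothetical "cdq-install" package folder one level up from it.
export OWL_HOME=$HOME/owl
export INSTALLER_DIRECTORY=$(dirname "$OWL_HOME")/cdq-install
mkdir -p "$INSTALLER_DIRECTORY/backup"
```

Later steps reference $INSTALLER_DIRECTORY, so export it in the same session you use for the rest of the upgrade.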
- Download the Collibra DQ installer packaged with the necessary Spark version from the Data Quality & Observability Classic Downloads page in the Product Resource Center.
- Run the following command to extract the installer.

  ```shell
  tar -xvf <downloaded .tar.gz file>
  ```
- Create backups of the following directories and files by moving or copying them to the backup folder.
- spark
- spark-extras (if it exists)
- drivers
- dq-core-*-dependencies.jar
- dq-webapp*.jar
- dq-agent*.jar
- owlcheck
- owlmanage.sh
- Remove the folders and files using the following or similar steps.
  - Open the OWL_HOME/bin folder.

    ```shell
    cd $OWL_HOME/bin
    ```
  - Move the dq-core-*-dependencies.jar, dq-webapp*.jar, dq-agent*.jar, owlcheck, and owlmanage.sh files to the INSTALLER_DIRECTORY/backup/ folder.

    ```shell
    mv dq*.jar owl* $INSTALLER_DIRECTORY/backup/
    ```
  - Move up one folder level.

    ```shell
    cd ..
    ```
  - Move drivers, spark, and spark-extras to the INSTALLER_DIRECTORY/backup/ folder.

    ```shell
    mv drivers spark spark-extras $INSTALLER_DIRECTORY/backup/
    ```
  - Confirm that the files and folders are in the backup folder.

    ```shell
    ls -ltr $INSTALLER_DIRECTORY/backup/
    ```
- Run the following command to go to the packages/install-packages folder.

  ```shell
  cd $INSTALLER_DIRECTORY/packages/install-packages
  ```
- Extract the Spark installation files. For example, you can use the following command to extract Spark 3.5.6.

  ```shell
  tar -xvf spark-3.5.6-bin-hadoop3.tgz
  ```
- Extract the spark-extras.tar.gz file.

  ```shell
  tar -xvf spark-extras.tar.gz
  ```
- Rename and move the extracted spark-3.5.6-bin-hadoop3 folder from the installer extract folder to the OWL_HOME/spark folder.
- Run the following command to copy the spark-extras folder into spark/jars.

  ```shell
  cp spark-extras/* $OWL_HOME/spark/jars
  ```
- Add the following line of code to owl/spark/conf/spark-env.sh (which can be copied from the spark-env.sh.template file provided) to automate the cleanup of Spark work folders and avoid filling up disk space. With these settings, the worker checks for old application data every 1,800 seconds (30 minutes) and removes data older than 3,600 seconds (1 hour).

  ```shell
  export SPARK_WORKER_OPTS="${SPARK_WORKER_OPTS} -Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.interval=1800 -Dspark.worker.cleanup.appDataTtl=3600"
  ```
- Copy in the Data Quality & Observability Classic application files.
  - Open the INSTALLER_DIRECTORY.

    ```shell
    cd $INSTALLER_DIRECTORY
    ```
  - Copy in the dq-core-*-dependencies.jar, dq-webapp*.jar, and dq-agent*.jar files.

    ```shell
    cp dq-*.jar $OWL_HOME/bin
    ```
  - Copy in the owlcheck and owlmanage.sh files.

    ```shell
    cp owl* $OWL_HOME/bin
    ```
  - Make the owlcheck and owlmanage.sh files executable.

    ```shell
    chmod +x $OWL_HOME/bin/owl*
    ```
- Extract and copy in the updated drivers.
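The spark-env.sh change above can also be applied non-interactively with a heredoc append; a sketch (for safe dry-running it falls back to a temporary directory when OWL_HOME is unset — on a real deployment OWL_HOME points at the Collibra DQ install folder):

```shell
# Fall back to a throwaway directory so the sketch is safe to dry-run;
# in production OWL_HOME is the real install folder.
OWL_HOME=${OWL_HOME:-$(mktemp -d)}
mkdir -p "$OWL_HOME/spark/conf"
# Append the worker-cleanup settings. The quoted 'EOF' delimiter keeps
# ${SPARK_WORKER_OPTS} literal so it is expanded at Spark startup, not now.
cat >> "$OWL_HOME/spark/conf/spark-env.sh" <<'EOF'
export SPARK_WORKER_OPTS="${SPARK_WORKER_OPTS} -Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.interval=1800 -Dspark.worker.cleanup.appDataTtl=3600"
EOF
```

The quoted heredoc delimiter is the key design detail: an unquoted `EOF` would expand `${SPARK_WORKER_OPTS}` at append time and bake the current (likely empty) value into the file.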
- Run the following commands to start the Spark and Collibra DQ services.
  - Start Spark Master.

    ```shell
    cd $OWL_HOME/spark/sbin
    ./start-master.sh
    ```
  - Start Spark Worker.

    ```shell
    cd $OWL_HOME/spark/sbin
    ./start-worker.sh spark://$(hostname -f):7077
    ```
  - Start DQ Web.

    ```shell
    cd $OWL_HOME/bin
    ./owlmanage.sh start=owlweb
    ```
  - Tail the log file until you see a "Bootstrap process complete" message.

    ```shell
    tail -f $OWL_HOME/log/owl-web.log
    ```
  - Start DQ Agent.

    ```shell
    cd $OWL_HOME/bin
    ./owlmanage.sh start=owlagent
    ```
- Visit the Spark page on port 8080 to confirm that the Spark version is the one you installed and that the worker is alive.
- Create a test DQ job without any optional DQ layers or rules to verify that the Spark driver, executors, and containers can launch successfully.
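Alongside the port-8080 UI check, spark-submit also reports the installed version from the command line; a sketch (assumes OWL_HOME is set as in the earlier steps; the fallback message exists only so the snippet can be dry-run safely):

```shell
# spark-submit prints a banner that includes the version string
# ("version 3.5.6" after this upgrade).
if [ -x "${OWL_HOME:-/nonexistent}/spark/bin/spark-submit" ]; then
  version_line=$("$OWL_HOME/spark/bin/spark-submit" --version 2>&1 | grep -m1 version)
else
  version_line="spark-submit not found; check OWL_HOME"
fi
echo "$version_line"
```

If the reported version is not 3.5.6, the old spark folder was likely not fully replaced; re-check the backup and move steps above.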