Pullup job sizing recommendations

To ensure optimal performance, efficient resource utilization, and successful job execution, you can fine-tune job resource settings from the global job limits page. This is especially important when you work with exceptionally large data sets.

The following sections provide recommendations on how to scale your job in different circumstances.

High-level resource targets

Depending on your data size, your job requires a target total of CPU cores and RAM to run efficiently. The following table outlines these high-level capacity targets.

Total rows    Total columns    Total cores    Total RAM
100K          50               2              3 GB
1M            50               3              6 GB
10M           50               10             52 GB

Tip If your table is extremely large, use Parallel JDBC to improve the performance of the data loading stage when the Pullup job runs.
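With Parallel JDBC, Spark splits the range of a numeric partition column into a fixed number of strides and issues one query per stride, so multiple executors load data concurrently. The following sketch illustrates that splitting logic in plain Python; it is a simplified illustration of the idea, not Spark's actual implementation, and the function name is made up for this example.

```python
def jdbc_strides(lower, upper, num_partitions):
    """Split the [lower, upper) range of a numeric partition column into
    per-partition bounds, one query per partition."""
    stride = (upper - lower) // num_partitions
    bounds = []
    start = lower
    for i in range(num_partitions):
        # The last partition absorbs any remainder so the full range is covered.
        end = upper if i == num_partitions - 1 else start + stride
        bounds.append((start, end))
        start = end
    return bounds

# A 10M-row id column split across 4 parallel readers
print(jdbc_strides(0, 10_000_000, 4))
```

Each pair of bounds becomes a WHERE clause on the partition column, so the table loads in parallel instead of through a single connection.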

Calculating total resources

The global limits page does not directly accept the "total" values described in the previous table. Instead, Spark distributes these total resources between a single driver node and multiple executor nodes. The underlying engine allocates resources based on these two formulas:

Total cores = driver cores + (number of executors × executor cores)
Total RAM = driver memory + (number of executors × executor memory)

Example To reach a target of 6 total cores and 16 GB of total RAM, you can configure 1 driver with 2 cores and 4 GB of RAM, plus 2 executors that each have 2 cores and 6 GB of RAM. The cores add up as 2 + (2 × 2) = 6, and the RAM as 4 + (2 × 6) = 16 GB.
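As a quick sanity check, the two allocation formulas can be expressed in a few lines of Python. The function names are illustrative only, and the driver in the example is assumed to use 2 cores.

```python
def total_cores(driver_cores, num_executors, executor_cores):
    # Total cores = driver cores + (executors x cores per executor)
    return driver_cores + num_executors * executor_cores

def total_ram_gb(driver_mem_gb, num_executors, executor_mem_gb):
    # Total RAM = driver memory + (executors x memory per executor)
    return driver_mem_gb + num_executors * executor_mem_gb

# 1 driver (2 cores, 4 GB) plus 2 executors (2 cores, 6 GB each)
print(total_cores(2, 2, 2))   # -> 6 total cores
print(total_ram_gb(4, 2, 6))  # -> 16 GB total RAM
```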

Recommended settings on the global limits page

To achieve the total targets described in the previous table, use the following baseline settings for the specific fields on the global limits page.

Note The following values are intended for illustrative purposes only.

Total rows  Total columns  Maximum number of driver cores  Maximum driver memory  Maximum number of executors  Maximum executor cores  Maximum executor memory
100K        50             1                               2 GB                   2                            2                       4 GB
1M          50             1                               2 GB                   1                            2                       4 GB
10M         50             1                               4 GB                   4                            2                       16 GB
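If you size jobs programmatically, the baseline table can be encoded as a simple lookup that picks the smallest tier covering the table's row count. The tier encoding and the helper function below are illustrative only, not part of the product.

```python
# Each tier: (max_rows, driver_cores, driver_mem_gb,
#             executors, executor_cores, executor_mem_gb)
BASELINES = [
    (100_000,    1, 2, 2, 2, 4),
    (1_000_000,  1, 2, 1, 2, 4),
    (10_000_000, 1, 4, 4, 2, 16),
]

def baseline_for(total_rows):
    """Return the first baseline tier whose row limit covers total_rows."""
    for tier in BASELINES:
        if total_rows <= tier[0]:
            return tier
    # Larger tables fall outside the published baselines.
    raise ValueError("no published baseline for %d rows" % total_rows)

print(baseline_for(500_000))  # picks the 1M-row tier
```

Tables larger than the largest published tier should be sized case by case rather than extrapolated from these baselines.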

Additional settings

For the remaining fields on the global limits page, use the following guidelines unless your specific workload dictates otherwise:

What's next