Thomas Zeiser

Some comments by Thomas Zeiser about HPC@RRZE and other things


Combining several sequential jobs in one PBS job to fill a complete node

Sometimes trivial parallelism is the most efficient way to parallelize work, e.g. for parameter studies with a sequential program. If only complete nodes may be allocated on a certain cluster, several sequential runs can easily be bundled into one job file:

#!/bin/bash -l
# allocate 1 node (4 CPUs) for 8 hours
#PBS -l nodes=1:ppn=4,walltime=08:00:00
# job name
#PBS -N  xyz
# first non-empty non-comment line ends PBS options

# jobs always start in HOME
# but we want to go to the directory where we submitted the job
cd "$PBS_O_WORKDIR"

# run 4 sequential parameter studies in parallel and bind each one
# to a specific core
(taskset -c 0  ./a.out input1.dat ) &
(taskset -c 1  ./a.out input2.dat ) &
(taskset -c 2  ./a.out input3.dat ) &
(taskset -c 3  ./a.out input4.dat ) &

# wait for all background processes to finish ("wait" is a bash built-in)
wait

The bash built-in wait, called without arguments, blocks until all background processes have finished, so the job does not end while some runs are still active.
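One caveat: called without arguments, wait always returns 0, so a crashed run goes unnoticed. A minimal sketch of one way around this, assuming you want the job to report failed runs: remember each PID via $! and wait on the PIDs one by one. The helper name wait_all is made up for illustration, and the true/false subshells are only stand-ins for the pinned ./a.out runs.

```shell
#!/bin/bash
# Hypothetical helper (not part of the job script above):
# wait for each given PID and return the number of failed processes.
wait_all() {
    local failed=0 pid
    for pid in "$@"; do
        # "wait PID" returns the exit status of that particular process
        if ! wait "$pid"; then
            echo "process $pid failed" >&2
            failed=$((failed + 1))
        fi
    done
    return "$failed"
}

# Stand-ins for the pinned runs; in a real job these would be lines like
# (taskset -c 0 ./a.out input1.dat) & pid1=$!
( true )  & pid1=$!
( false ) & pid2=$!

# $? on the right-hand side of || still holds the status of wait_all
wait_all "$pid1" "$pid2" || echo "$? run(s) failed" >&2
```

In a real job script you would probably replace the final echo with an exit so that the batch system sees a non-zero exit code.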

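For this to work efficiently, all parameter runs should of course take about the same time: the node stays allocated until the slowest run has finished, so cores whose runs end early just sit idle.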
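If the run times differ a lot, or there are more inputs than cores, a simple pool that starts the next run as soon as a core becomes free keeps the node busier. The following is only a sketch of that idea and swaps the technique: it uses xargs -P (a GNU/BSD extension, not POSIX) and leaves core placement to the OS scheduler instead of pinning with taskset. The function name run_pool is hypothetical, echo stands in for ./a.out, and the input file names are assumed to contain no whitespace.

```shell
#!/bin/bash
# Hypothetical helper: run PROG once per input, at most NCORES at a time.
# usage: run_pool NCORES PROG INPUT...
run_pool() {
    local ncores=$1 prog=$2
    shift 2
    # one input name per line; xargs passes one name per invocation (-n 1)
    # and keeps up to $ncores invocations running concurrently (-P)
    printf '%s\n' "$@" | xargs -n 1 -P "$ncores" "$prog"
}

# Six inputs on four cores; "echo" is a stand-in for ./a.out
run_pool 4 echo input1.dat input2.dat input3.dat input4.dat input5.dat input6.dat
```

The order in which the runs finish, and therefore the order of any output, is not deterministic.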