Thomas Zeiser

Some comments by Thomas Zeiser about HPC@RRZE and other things

Combining several sequential jobs in one PBS job to fill a complete node

Sometimes trivial parallelism is the most efficient way to parallelize work, e.g. for parameter studies with a sequential program. If only complete nodes can be allocated on a cluster, several sequential runs can easily be bundled into one job file:

#!/bin/bash -l
# allocate 1 node (4 CPUs) for 8 hours
#PBS -l nodes=1:ppn=4,walltime=08:00:00
# job name
#PBS -N  xyz
# first non-empty non-comment line ends PBS options

# jobs always start in HOME
# but we want to go to the directory where we submitted the job
cd "$PBS_O_WORKDIR"

# run 4 sequential parameter studies in parallel and bind each one
# to a specific core
(taskset -c 0  ./a.out input1.dat ) &
(taskset -c 1  ./a.out input2.dat ) &
(taskset -c 2  ./a.out input3.dat ) &
(taskset -c 3  ./a.out input4.dat ) &

# wait for all background processes to finish ("wait" is a bash built-in)
wait

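The bash built-in wait blocks until all background processes have finished. Without it, the job script would reach its end immediately, and the batch system would end the job while the runs are still in progress.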
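With more than a handful of inputs, the explicit taskset lines can be written as a loop. The following is only a sketch: run_bundle is a hypothetical helper name, not part of PBS or taskset, and it assumes that there are no more input files than cores on the node. It also waits on each process ID separately, because a plain wait without arguments returns status 0 regardless of how the background runs ended.

```shell
#!/bin/bash
# Sketch: start one pinned background run per input file, then wait for
# each run individually so that failures are noticed.
# run_bundle is a hypothetical helper: $1 = program, remaining args = inputs.
# Assumption: the number of inputs does not exceed the number of cores.
run_bundle() {
    local prog=$1; shift
    local core=0 failed=0 pids="" pid input
    for input in "$@"; do
        # bind this run to its own core and put it in the background
        taskset -c "$core" "$prog" "$input" &
        pids="$pids $!"
        core=$((core + 1))
    done
    # "wait PID" returns the exit status of that particular process
    for pid in $pids; do
        wait "$pid" || failed=$((failed + 1))
    done
    echo "$failed run(s) failed"
    [ "$failed" -eq 0 ]
}
```

Inside the job file this could be called as run_bundle ./a.out input1.dat input2.dat input3.dat input4.dat, directly after changing to the submit directory.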

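For this to use the node efficiently, all parameter runs should of course take about the same time: the node stays allocated until the slowest run has finished, and cores whose runs ended earlier sit idle until then.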
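If the runs differ a lot in length, or there are more inputs than cores, one possible variation is a small work queue: a fixed number of slots, each picking up the next input as soon as its previous run has finished. The sketch below uses the -P and -0 options of xargs, which are extensions found in GNU and BSD xargs and not part of POSIX. run_queue is a hypothetical helper name, and unlike the taskset version above it does not bind the runs to specific cores.

```shell
#!/bin/bash
# Sketch: process any number of inputs with at most $1 runs at a time.
# run_queue is a hypothetical helper: $1 = number of slots (e.g. cores),
# $2 = program, remaining args = input files.
# Assumption: xargs supports -0 and -P (GNU or BSD xargs).
run_queue() {
    local slots=$1 prog=$2; shift 2
    # NUL-separated names keep file names with spaces intact;
    # -n 1 passes one input per run, -P limits the concurrent runs
    printf '%s\0' "$@" | xargs -0 -n 1 -P "$slots" "$prog"
}
```

In the job file above this would be called as run_queue 4 ./a.out input*.dat. No trailing wait is needed here, since xargs itself only returns after all runs have finished.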