Thomas Zeiser

Some comments by Thomas Zeiser about HPC@RRZE and other things


SGI Altix extension

Recently, the SGI Altix at RRZE has been extended. We now have a batch-only system altix-batch (an SGI Altix 3700 with 32 CPUs and 128 GB shared memory) and a front-end system altix (an SGI Altix 330 with 16 CPUs and 32 GB shared memory). On the front-end, 4 CPUs and 8 GB are used as the login partition (boot cpuset); the remaining CPUs and memory are also used for batch processing.

An important thing to note is that the new machine has only half as much memory per CPU as the "old" one. The cpusets introduced with SuSE SLES9/SGI ProPack4.x do not have all the features known from the old SGI Origin cpusets; in particular, policy kill is missing. The system therefore starts swapping as soon as one process exceeds the amount of memory available in its cpuset. As a result, the complete system becomes unresponsive.

Therefore, it is very important to request the correct machine or to specify the amount of memory required in addition to the number of CPUs. The amount of interactive work is also much more limited now, as the login partition (boot cpuset) only has access to 4 CPUs and 8 GB of memory!
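
To make the numbers above concrete, here is a small Python sketch of the memory-per-CPU arithmetic. The CPU and memory figures are the ones given in this post; the `Partition` class, the `fits_default_share` helper and the rule that a job's cpuset gets memory in proportion to its CPUs are illustrative assumptions only, not the actual policy of the batch system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """Hypothetical description of the batch-usable part of one machine."""
    name: str
    cpus: int
    mem_gb: int

    @property
    def mem_per_cpu_gb(self) -> float:
        return self.mem_gb / self.cpus


# Figures taken from the text above.
ALTIX_BATCH = Partition("altix-batch", cpus=32, mem_gb=128)
# Front-end minus the login partition (boot cpuset): 16 - 4 CPUs, 32 - 8 GB.
ALTIX_FRONTEND = Partition("altix (batch part)", cpus=16 - 4, mem_gb=32 - 8)


def fits_default_share(part: Partition, cpus: int, mem_gb: float) -> bool:
    """Assumption: a job's cpuset gets memory proportional to its CPU count."""
    return cpus <= part.cpus and mem_gb <= cpus * part.mem_per_cpu_gb


if __name__ == "__main__":
    for part in (ALTIX_BATCH, ALTIX_FRONTEND):
        print(f"{part.name}: {part.mem_per_cpu_gb:.0f} GB per CPU")
    # A 4-CPU job that needs 12 GB in total:
    for part in (ALTIX_BATCH, ALTIX_FRONTEND):
        print(part.name, fits_default_share(part, cpus=4, mem_gb=12))
```

Under this assumption, a 4-CPU job needing 12 GB fits on altix-batch (4 GB per CPU) but not within a 4-CPU share of the front-end (2 GB per CPU). Such a job would have to request more memory explicitly or be sent to altix-batch.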

Check the official web page of the SGI Altix systems at RRZE for more details and the correct syntax for specifying resource requirements.