MpCCI on SGI Altix (part 2)
MpCCI is a commercial tool (basically a library) which allows coupling different numerical codes. MpCCI is only distributed in binary form and depends on a number of other tools, in particular on MPI. For the Itanium architecture, only an MPICH-based version is provided right now.
Using a standard MPICH (with the p4 device) on SGI Altix is a rather bad idea, as the ssh-based start mechanism does not respect CPUsets, proper clean-up is not guaranteed, etc.
Due to the problems related to the ssh-based start mechanism of a standard ch_p4 MPICH, the corresponding mpirun was removed on Sept. 14, 2005! This guarantees better stability of our SGI Altix system but requires some additional steps for users of MpCCI:
-  load the module mpcci/3.0.3-ia64-glibc23 or mpcci/3.0.3-ia64-glibc23-intel9 as usual (I hope both still work fine)
-  compile your code as usual (and as in the past); MPICHHOME and MPIROOTDIR are automatically set by the MpCCI module
-  create your MpCCI specific input files
-  interactively run ccirun -log -norun xxx.cci
-  edit the generated ccirun.procgroup file (see the remarks on ccirun.procgroup further down)
-  now prepare your PBS job file; use the following line to start your program (it replaces the previous mpirun line!):
/opt/MpCCI/mpiexec-0.80/bin/mpiexec -mpich-p4-no-shmem -config=ccirun.procgroup
 
-  and submit your job. The number of CPUs you request must be equal to (or larger than) the number of processes you start, i.e. you have to count the MpCCI control process as well!
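The last two steps can be sketched as a PBS job file. This is only a minimal sketch under assumptions: the job name, the walltime value and the helper function start_coupled_run are made up for illustration, and the CPU request is left out on purpose because its syntax depends on the local PBS configuration. Only the mpiexec line itself is taken from the text above.

```shell
#!/bin/sh
#PBS -N mpcci-coupled
#PBS -l walltime=01:00:00
# Job name and walltime are placeholder values (assumptions).
# The CPU request is site-specific and therefore omitted; request at
# least as many CPUs as processes, including the MpCCI control process.

# Path taken from the text above; override MPIEXEC for a dry run.
MPIEXEC=${MPIEXEC:-/opt/MpCCI/mpiexec-0.80/bin/mpiexec}

# Hypothetical helper: start the coupled run from an edited procgroup file.
start_coupled_run() {
    procgroup=${1:-ccirun.procgroup}
    if [ ! -r "$procgroup" ]; then
        echo "cannot read $procgroup" >&2
        return 1
    fi
    # This line replaces the previous mpirun line.
    "$MPIEXEC" -mpich-p4-no-shmem -config="$procgroup"
}

# Only start the run when executed inside a PBS job.
if [ -n "${PBS_JOBID:-}" ]; then
    cd "$PBS_O_WORKDIR" || exit 1
    start_coupled_run ccirun.procgroup
fi
```

Setting MPIEXEC=echo before calling start_coupled_run prints the command line instead of executing it, which may help when checking the job file interactively.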
Some additional remarks:
-  it is not clear at the moment whether the runtime of such jobs can be extended once they are submitted/running. We’ll probably have to check this on an actual run …
- if your code reads from STDIN, you need an additional step to get it working again:
- if you have something like read(*,*) x or read *,x you have to set the environment variable FOR_READ to the file which contains the input
- if you have something like read(5,*) x or read 5,x you have to set the environment variable FORT5 to the file which contains the input
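In a job file, the two STDIN cases could be handled as in the following minimal sketch. The file name input.dat and the variable INPUT_FILE are hypothetical; setting both variables at once is assumed to be harmless, otherwise keep only the line matching your code.

```shell
# input.dat is a hypothetical file name used for illustration.
INPUT_FILE=${INPUT_FILE:-$PWD/input.dat}

# Case 1: the code uses read(*,*) x or read *,x
export FOR_READ="$INPUT_FILE"

# Case 2: the code uses read(5,*) x or read 5,x
export FORT5="$INPUT_FILE"
```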
 
Two additional remarks on ccirun.procgroup for mpiexec:
-  for some reason, it seems to be necessary to use only the short hostname (e.g. “altix”) instead of the fully qualified hostname (e.g. altix.rrze.uni-erlangen.de)
-  with some applications, the first line in the procgroup file must be “altix : some-path/ccirun.cci-control.0.spawn --mpcci-inputfile ...”; with other applications, this line must be omitted (and the option “--mpcci-inputfile ...” has to be passed to the first actual executable)
Additional MpCCI remarks: as we now have two different SGI Altix systems in our batch system, you either have to explicitly request one host using -l host=altix or -l host=altix-batch, or you have to dynamically generate the config file for mpiexec.
In addition, mpiexec has been upgraded to a newer version. Just use /opt/MpCCI/mpiexec/bin/mpiexec to always get the latest version. -mpich-p4-no-shmem is no longer necessary as it is compiled in as the default.
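Dynamically generating the config file for mpiexec could be done along the following lines. This is a rough sketch, not a tested recipe: the function make_procgroup and the file names are hypothetical, and it assumes every line of the file has the form “host : executable arguments” as in the procgroup example above. It replaces the host field with the short name of the host the job actually runs on.

```shell
# Hypothetical helper: copy a procgroup-style file and replace the host
# field (everything before the first colon) on every line.
make_procgroup() {
    template=$1
    output=$2
    # Default: short hostname of the current machine (no domain part).
    host=${3:-$(hostname | cut -d. -f1)}
    sed "s/^[^:]*:/$host :/" "$template" > "$output"
}
```

Inside the job file one might call make_procgroup ccirun.procgroup ccirun.procgroup.job (both file names are placeholders) and then pass the generated file to mpiexec via -config=.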