Rocks versus (OpenMPI versus MPICH2)

Last week I was running some benchmarks on the new cluster at $WORK; I was trying to see what effect compiling a new, Torque-aware version of OpenMPI would have. As you may remember, the stock version of OpenMPI that comes with Rocks is not Torque-aware, so a workaround was added that told OpenMPI which nodes Torque had allocated to it.

The change was in the Torque submission script. Stock version:

source /opt/torque/etc/openmpi-setup.sh
/opt/openmpi/bin/mpiexec -n $NUM_PROCS /usr/bin/emacs

New, Torque-aware version:

/path/to/new/openmpi/bin/mpiexec -n $NUM_PROCS /usr/bin/emacs

(Of course I benchmark the cluster using Emacs. Don't you have the code for the MPI-aware version?)

In the end, there wasn't a whole lot of difference in runtimes; that didn't surprise me too much, since (as I understand it) the difference between the two methods is mainly in the starting of jobs -- the overhead at the beginning, rather than in the running of the job itself.

For fun, I tried running the job with MPICH2, another MPI implementation:

/opt/mpich2/gnu/bin/mpiexec -n $NUM_PROCS /usr/bin/emacs

and found pretty terrible performance. It turned out that it wasn't running on all the nodes; in fact, it was only running on one node, with as many processes as CPUs I'd specified. Since this was meant to be a 4-node, 8-CPU/node job, that meant 32 copies of Emacs on one node. Damn right it was slow.
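One quick way to spot this kind of pile-up is to launch `hostname` in place of the real program and count how many processes land on each node. What follows is only a sketch: `count_hosts` is a helper name made up for illustration, and it assumes the launcher passes each process's output back to the job's stdout.

```shell
# count_hosts (hypothetical helper): read one hostname per line on stdin
# and print "<count> <hostname>" for each distinct host.
count_hosts() {
    sort | uniq -c | awk '{ print $1, $2 }'
}

# Usage inside a Torque job script, with NUM_PROCS set as in the commands above:
#   /opt/mpich2/gnu/bin/mpiexec -n "$NUM_PROCS" hostname | count_hosts
```

If every process is on the same node, the output is a single line whose count equals the number of processes requested.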

So what the hell? My first thought was that maybe this was a library-versus-launcher mismatch. You compile MPI applications using the OpenMPI or MPICH2 versions of the GNU compilers -- which are basically just wrappers around the regular tools that set library paths and such correctly. So if your application links to OpenMPI but you launch it with MPICH2, maybe that's the problem.
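A rough way to test the mismatch theory is to look at which MPI shared libraries the binary actually links against. The sketch below reads `ldd` output and makes a guess. The helper name `mpi_flavor` is made up, and the library-name patterns (`libopen-rte`/`libopen-pal` for OpenMPI, `libmpich` for MPICH2) are assumptions that may not hold for every version or build, so treat the answer as a hint rather than proof.

```shell
# mpi_flavor (hypothetical helper): read `ldd` output on stdin and guess
# which MPI implementation the binary was linked against.
# The patterns below are assumptions; library names vary by version and build.
mpi_flavor() {
    awk '
        /libopen-rte|libopen-pal/ { found = "openmpi" }
        /libmpich/                { if (found == "") found = "mpich" }
        END { print (found == "" ? "unknown" : found) }
    '
}

# Usage (the path is a placeholder):
#   ldd /path/to/your/mpi/app | mpi_flavor
```

If the guess doesn't match the `mpiexec` you're launching with, the mismatch theory is at least plausible.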

I still need to test that. However, I think what's more likely is that MPICH2 is not Torque-aware. The ever-excellent Debian Clusters has a good page on this, with a link to the other other other mpiexec page. Now I need to figure out if the Rocks people have changed anything since 2008, and if the Torque Roll documentation is incomplete or just misunderstood (by me).
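For context: inside a job, Torque sets `$PBS_NODEFILE` to the name of a file listing the allocated nodes, one line per processor slot. A launcher that isn't Torque-aware never reads it unless told to. Here's a small sketch for seeing what Torque actually handed out; `nodefile_summary` is a made-up helper name.

```shell
# nodefile_summary (hypothetical helper): print how many slots and how many
# distinct nodes a node file contains. Defaults to $PBS_NODEFILE, which
# Torque sets inside a running job.
nodefile_summary() {
    file="${1:-$PBS_NODEFILE}"
    slots=$(wc -l < "$file" | tr -d ' ')
    nodes=$(sort -u "$file" | wc -l | tr -d ' ')
    echo "slots=$slots nodes=$nodes"
}

# Usage inside a job script:
#   nodefile_summary
# Handing the list to MPICH2 depends on its process manager. With Hydra,
# something like `mpiexec -f "$PBS_NODEFILE" ...` should work, but check
# the documentation for the version you have installed.
```

For the 4-node, 8-CPU/node job above, this should report 32 slots across 4 nodes. If it does and everything still lands on one node, the launcher is ignoring the allocation.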