[mvapich-discuss] Possibly undesirable mvapich "feature" (was Possible bug)

Laurence Marks L-marks at northwestern.edu
Thu Sep 25 11:48:58 EDT 2008

I think I may have partially resolved my previous problem, but not
completely.

One of the engineers at the company that sold me the cluster pointed
out that the first node running the job (using 8 cores) was doing a
little swap, even though the mpi job itself was not requiring swap. I
suspect that the I/O and other general OS tasks associated with
communicating from the 1st core to all the others were leading to this
and causing problems. I can resolve this by making my head node the
first entry in the machines file; then everything is OK.

Unfortunately this leads to another problem. If I have two mpi jobs
both using one core on the head node, instead of using separate cores
they both use the same one! I suspect that this is a design feature,
i.e. to use the first core unless something else has been specified
with VIADEV_CPU_MAPPING or similar. I wonder if there is any way
around this short of specifying different mappings for different jobs,
which would become a bit of a nightmare since individual users (i.e.
my students) would have to get it right. An alternative is running
with 7 cores on the first machine to leave some free CPU for OS
operations, but this is inefficient.
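
To make the "different mappings for different jobs" idea concrete, here
is a rough sketch of how a job wrapper might choose the mapping
automatically so that users do not have to. It is only an illustration
under several assumptions: the helper names (claim_core, release_core,
cpu_mapping) and the lock directory are hypothetical, the head node is
assumed to have the same number of cores as the compute nodes, and
VIADEV_CPU_MAPPING is assumed to accept a colon-separated list of core
IDs applied to the local ranks in order (worth checking against the
MVAPICH user guide for the installed version).

```python
import os


def claim_core(lock_dir, ncores):
    """Atomically claim the lowest unclaimed core index, or return None.

    Hypothetical helper: one lock file per core in a directory that all
    jobs started on the head node can see.
    """
    os.makedirs(lock_dir, exist_ok=True)
    for core in range(ncores):
        path = os.path.join(lock_dir, "core%d.lock" % core)
        try:
            # O_CREAT | O_EXCL fails if the file already exists, so two
            # jobs starting at the same time cannot claim the same core.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            continue
        os.write(fd, str(os.getpid()).encode())
        os.close(fd)
        return core
    return None


def release_core(lock_dir, core):
    """Give the core back when the job finishes."""
    os.remove(os.path.join(lock_dir, "core%d.lock" % core))


def cpu_mapping(first_core, ncores):
    """Colon-separated core list starting at first_core and wrapping.

    The single process on the head node would get first_core, while a
    node running ncores processes would still use every core once.
    """
    return ":".join(str((first_core + i) % ncores) for i in range(ncores))
```

A wrapper script would call claim_core() before launching, pass the
string from cpu_mapping() to the job as VIADEV_CPU_MAPPING in whatever
way the launcher expects, and call release_core() afterwards. A job that
crashes would leave a stale lock file behind, so a real version would
need some cleanup.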

Laurence Marks
Department of Materials Science and Engineering
MSE Rm 2036 Cook Hall
2220 N Campus Drive
Northwestern University
Evanston, IL 60208, USA
Tel: (847) 491-3996 Fax: (847) 491-7820
email: L-marks at northwestern dot edu
Web: www.numis.northwestern.edu
Chair, Commission on Electron Crystallography of IUCR
