[mvapich-discuss] Possibly undesirable mvapich "feature" (was Possible bug)

Laurence Marks L-marks at northwestern.edu
Thu Sep 25 11:48:58 EDT 2008


I think I have partially, but not completely, resolved my previous problem
(http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/2008-September/001920.html).

One of the engineers at the company that sold me the cluster pointed
out that the first node running the job (using 8 cores) was swapping a
little, even though the MPI job itself did not require swap. I
suspect that the I/O and other general OS tasks associated with
communicating from the first core to all the others were leading to this
and causing problems. I can resolve this by making my head node the first
entry in the machines file; then everything is OK.

Unfortunately this leads to another problem. If I have two MPI jobs
that each use one core on the head node, they both use the same core
instead of separate ones! I suspect that this is a design feature,
i.e. the first core is used unless something else has been specified
with VIADEV_CPU_MAPPING or similar. I wonder if there is any way
around this short of specifying different mappings for different jobs,
which would become a bit of a nightmare since individual users (i.e.
my students) would have to get it right. An alternative is to run
with 7 cores on the first machine to leave some free CPU for OS
operations, but this is inefficient.
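
To illustrate what "different mappings for different jobs" could look like if a wrapper script did it on the users' behalf, here is a minimal sketch. It rests on several assumptions: build_cpu_mapping is a hypothetical helper, not part of MVAPICH; the caller already knows which cores on the node are in use by other jobs (how to find that out is not shown); and the colon-separated value format for VIADEV_CPU_MAPPING should be checked against the user guide for the installed MVAPICH version.

```python
def build_cpu_mapping(ranks_on_node, busy_cores, total_cores):
    """Return a colon-separated list of free core IDs, one per local rank.

    Hypothetical helper: it only formats a candidate value for
    VIADEV_CPU_MAPPING. The colon-separated format is an assumption.
    """
    # Keep the cores that no other job on this node has claimed.
    free = [core for core in range(total_cores) if core not in busy_cores]
    if ranks_on_node > len(free):
        raise ValueError("not enough free cores on this node")
    # Assign the lowest-numbered free cores to the local ranks.
    return ":".join(str(core) for core in free[:ranks_on_node])


if __name__ == "__main__":
    # First job on an 8-core head node: nothing is taken yet.
    first = build_cpu_mapping(1, set(), 8)
    # Second job: core 0 is already used by the first job.
    second = build_cpu_mapping(1, {0}, 8)
    print("job 1: VIADEV_CPU_MAPPING=" + first)
    print("job 2: VIADEV_CPU_MAPPING=" + second)
```

With such a helper the two jobs would get different cores on the head node, but something still has to keep track of which cores are busy, so this only moves the bookkeeping from the students to a script.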

-- 
Laurence Marks
Department of Materials Science and Engineering
MSE Rm 2036 Cook Hall
2220 N Campus Drive
Northwestern University
Evanston, IL 60208, USA
Tel: (847) 491-3996 Fax: (847) 491-7820
email: L-marks at northwestern dot edu
Web: www.numis.northwestern.edu
Chair, Commission on Electron Crystallography of IUCR
www.numis.northwestern.edu/IUCR_CED

