[mvapich-discuss] Default of MV2_ENABLE_AFFINITY: why 1?

Stephen Cousins steve.cousins at maine.edu
Mon Jun 18 13:17:28 EDT 2012


I am revisiting some affinity issues. For jobs that run on all cores of a
node I can see that having MV2_ENABLE_AFFINITY=1 is beneficial. However, if
users run on a subset of the cores, so that other jobs may be scheduled on
the same node, this is a big problem. It is fine for the first job. The
second job, though, ends up on the same cores as the first, and
performance goes down dramatically. We are using Torque and Moab.

I checked the mvapich2 configure script to see if there was a way to
compile the code with a different default, but I didn't find one. Rather
than changing the code, I have set the environment variable in the module
file that loads the MVAPICH2 environment.
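
As a rough illustration of what such a site-wide setting amounts to, the
sketch below applies a fallback value for MV2_ENABLE_AFFINITY while still
letting a user override it. It is written in Python rather than modulefile
syntax, and the helper name with_affinity_default is hypothetical; only the
variable name MV2_ENABLE_AFFINITY comes from MVAPICH2.

```python
import os


def with_affinity_default(env, default="0"):
    """Return a copy of env in which MV2_ENABLE_AFFINITY falls back to `default`.

    A value the user has already set is left untouched.
    """
    new_env = dict(env)
    new_env.setdefault("MV2_ENABLE_AFFINITY", default)
    return new_env


if __name__ == "__main__":
    # The resulting dict could be passed as env= to subprocess.run()
    # when starting the MPI launcher.
    env = with_affinity_default(os.environ)
    print("MV2_ENABLE_AFFINITY =", env["MV2_ENABLE_AFFINITY"])
```

The point of using a fallback rather than an unconditional assignment is
that users who run on whole nodes can still turn affinity back on.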

To my mind, 0 is the better default, since the cost of getting this
setting wrong is currently far worse than the benefit of getting it
right. That is, if you get it wrong you see at least a 100% time penalty
in your job, whereas if you get it right (that is, you really do want
affinity set) you get maybe a 10% to 20% benefit.
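
The trade-off can be put in numbers with a back-of-the-envelope sketch. It
assumes a simplified model that is not from any measurement: runtime is 1.0
for a job with affinity on and no collision, 1.0 + collision_penalty for a
job with affinity on that lands on another job's cores, and
1.0 + affinity_gain for any job with affinity off. The function names are
made up for illustration.

```python
def mean_runtime_with_affinity(collision_fraction, collision_penalty):
    """Average relative runtime when affinity is on by default."""
    return 1.0 + collision_fraction * collision_penalty


def mean_runtime_without_affinity(affinity_gain):
    """Average relative runtime when affinity is off by default."""
    return 1.0 + affinity_gain


def break_even_collision_fraction(affinity_gain, collision_penalty):
    """Fraction of colliding jobs at which both defaults cost the same."""
    if collision_penalty <= 0:
        raise ValueError("collision_penalty must be positive")
    return affinity_gain / collision_penalty


if __name__ == "__main__":
    # 100% penalty when wrong, 10% to 20% benefit when right.
    for gain in (0.10, 0.20):
        p = break_even_collision_fraction(gain, 1.0)
        print(f"gain {gain:.0%}: break-even at {p:.0%} of jobs colliding")
```

Under these assumptions, a default of 0 comes out ahead as soon as more
than roughly 10% to 20% of jobs share a node with another MVAPICH2 job.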

In general, I'd much rather have affinity enabled, but not the way it is
currently implemented.
How about this: if affinity is enabled, then when new processes are
started, make sure they are started on cores that aren't already being
used, at least not by other MVAPICH2 programs. For non-MVAPICH2 programs
(at least the ones I'm seeing with CHARM or OpenMP; I'll have to check with
OpenMPI jobs), the Linux scheduler seems to move them around
appropriately, scattering them among the free sockets/cores.
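
The selection step being proposed could look something like the sketch
below. It is only an illustration of the idea, not how MVAPICH2 works; the
function name pick_free_cores is invented, and how the set of busy cores
would be discovered (from the scheduler, or from other running jobs) is
left open.

```python
def pick_free_cores(node_cores, busy_cores, wanted):
    """Choose `wanted` cores from node_cores that are not in busy_cores.

    Returns the lowest-numbered free cores and raises RuntimeError if
    the node does not have enough free cores.
    """
    free = sorted(set(node_cores) - set(busy_cores))
    if len(free) < wanted:
        raise RuntimeError(
            f"need {wanted} free cores, only {len(free)} available"
        )
    return free[:wanted]


if __name__ == "__main__":
    # An 8-core node where the first job already occupies cores 0-3.
    print(pick_free_cores(range(8), [0, 1, 2, 3], 4))
```

With the first job on cores 0-3, a second four-process job would be placed
on cores 4-7 instead of on top of the first.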

I have seen on the list that the general answer to this problem is to use
CPU mapping, but unless this can be done automatically by Moab/Torque it
will not work. For one thing, each node that is assigned to the job may
require a different mapping, depending on what else is running on the nodes.
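
To make the per-node problem concrete, here is a small sketch that builds
one mapping string per node from that node's free cores. It assumes the
colon-separated form of MV2_CPU_MAPPING (one core per local rank, for
example 0:1:4:5); check the MVAPICH2 user guide for the exact syntax in
your version. The node names and the helper names are hypothetical.

```python
def cpu_mapping_string(cores):
    """Join core ids into a colon-separated mapping, one per local rank."""
    return ":".join(str(c) for c in cores)


def per_node_mappings(free_cores_by_node, ranks_per_node):
    """Build a mapping string for each node from its free cores."""
    mappings = {}
    for node, free in free_cores_by_node.items():
        chosen = sorted(free)[:ranks_per_node]
        if len(chosen) < ranks_per_node:
            raise RuntimeError(f"{node}: not enough free cores")
        mappings[node] = cpu_mapping_string(chosen)
    return mappings


if __name__ == "__main__":
    # Hypothetical two-node job; each node has different cores in use.
    free = {"node01": [4, 5, 6, 7], "node02": [0, 1, 2, 3]}
    for node, mapping in per_node_mappings(free, 4).items():
        print(node, "MV2_CPU_MAPPING=" + mapping)
```

The two nodes end up with different strings, which is why a single value
set by the user at submit time cannot cover the whole job.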

What do you think?


Steve Cousins - Supercomputer Engineer/Administrator - Univ of Maine
Marine Sciences, 452 Aubert Hall       Target Tech, 20 Godfrey Drive
Orono, ME 04469    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~     Orono, ME 04473
(207) 581-4302     ~ steve.cousins at maine.edu ~     (207) 866-6552