[mvapich-discuss] Default of MV2_ENABLE_AFFINITY: why 1?

Jonathan Perkins perkinjo at cse.ohio-state.edu
Tue Jun 19 07:26:49 EDT 2012


Yes, the resource manager would keep track of the resources available on
each node and allocate accordingly.
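
For example (just a sketch, and the exact settings depend on your site),
SLURM can enforce per-job binding through its task plugin in slurm.conf:

    TaskPlugin=task/affinity    # or task/cgroup for kernel cpusets

and I believe Torque can be built with cpuset support (the --enable-cpuset
configure option) so that pbs_mom confines each job to the cores it was
allocated.  With a kernel cpuset in place, processes in the job cannot be
scheduled, or bound, outside of that set of cores.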

On Tue, Jun 19, 2012 at 12:48:20AM -0400, Stephen Cousins wrote:
> Hi Jonathan,
> 
> If Torque or Slurm were to do this, it would need to create a potentially
> different cpuset for each node, right? I just want to make sure we're on
> the same page.
> 
> Thanks,
> 
> Steve
> 
> On Mon, Jun 18, 2012 at 9:48 PM, Jonathan Perkins <
> perkinjo at cse.ohio-state.edu> wrote:
> 
> > On Mon, Jun 18, 2012 at 01:17:28PM -0400, Stephen Cousins wrote:
> > > Hi,
> > >
> > > I am revisiting some affinity issues. For jobs that run on all cores of
> > > a node, I can see that having MV2_ENABLE_AFFINITY=1 is beneficial.
> > > However, if users run on only a subset of the cores, so that other jobs
> > > may also be scheduled on the same node, this is a big problem. It is
> > > fine for the first job, but the second job gets pinned to the same
> > > cores as the first and its performance drops dramatically. We are using
> > > Torque and Moab.
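> > >
> > > (For what it's worth, the overlap is easy to confirm by checking the
> > > affinity mask of a rank from each job, for example:
> > >
> > >     taskset -cp <pid>
> > >     grep Cpus_allowed_list /proc/<pid>/status
> > >
> > > With both jobs pinned at the defaults, ranks from the two jobs report
> > > the same core list.)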
> > >
> > > I checked the mvapich2 configure script to see if there was a way to
> > > compile the code with a different default, but I didn't find one.
> > > Rather than changing the code, I have set the environment variable in
> > > the module file that loads the MVAPICH2 environment.
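> > >
> > > For reference, if the module file is Tcl-based, the relevant line
> > > amounts to something like:
> > >
> > >     setenv MV2_ENABLE_AFFINITY 0
> > >
> > > (or "export MV2_ENABLE_AFFINITY=0" for anyone setting it by hand in the
> > > shell).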
> > >
> > > To my mind, 0 is the better default, because the cost of having it set
> > > wrong is far greater than the benefit of having it set right. That is,
> > > if you get it wrong you see at least a 100% time penalty in your job,
> > > whereas if you get it right (that is, you really do want affinity set)
> > > you gain maybe 10% to 20%.
> > >
> > > In general, I would much rather have affinity enabled, but not the way
> > > it is currently implemented. How about this: if affinity is enabled,
> > > then when new processes are started, make sure they are placed on cores
> > > that are not already in use, at least not by other MVAPICH2 programs.
> > > For non-MVAPICH2 programs (at least the ones I'm seeing with CHARM or
> > > OpenMP; I still have to check OpenMPI jobs), the Linux scheduler seems
> > > to bounce them around appropriately, scattering them amongst the free
> > > sockets/cores.
> > >
> > > I have seen on the list that a general answer to this problem is to use
> > > CPU mapping, but unless this can be done automatically by Moab/Torque
> > > it will not work for us. For one thing, each node assigned to the job
> > > may require a different mapping, depending on what else is running on
> > > the nodes.
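> > >
> > > (To illustrate what I mean: as I understand it, the mapping is given
> > > explicitly per rank, along the lines of
> > >
> > >     mpirun_rsh -np 4 -hostfile ./hosts MV2_CPU_MAPPING=2:3:6:7 ./a.out
> > >
> > > with the core numbers, hostfile and binary made up here, and the core
> > > list chosen by hand, which is exactly the part that would have to be
> > > generated per node.)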
> > >
> > > What do you think?
> >
> > Thank you for your note.  We are discussing the issue that you're
> > bringing up to see if we can do anything additional to address this
> > situation.
> >
> > It seems that it would be ideal for a job manager such as Torque or
> > SLURM to set the cpuset that the jobs can run on.  If this is
> > configured, this should not be an issue.
> >
> > --
> > Jonathan Perkins
> > http://www.cse.ohio-state.edu/~perkinjo
> >
> 
> 
> 
> -- 
> ______________________________________________________________________
> Steve Cousins - Supercomputer Engineer/Administrator - Univ of Maine
> Marine Sciences, 452 Aubert Hall       Target Tech, 20 Godfrey Drive
> Orono, ME 04469    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~     Orono, ME 04473
> (207) 581-4302     ~ steve.cousins at maine.edu ~     (207) 866-6552

-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo

