[mvapich-discuss] Core binding oversubscription with batch schedulers

Jonathan Perkins perkinjo at cse.ohio-state.edu
Mon Oct 14 15:06:15 EDT 2013


On Thu, Oct 10, 2013 at 02:40:01PM +0100, Mark Dixon wrote:
> On Thu, 12 Sep 2013, Jonathan Perkins wrote:
> >If so, you can activate this rsh replacement by setting the
> >environment variable RSH_CMD to the appropriate command at
> >configure time.
> 
> Yes; we used to do something similar to that when we used mpirun_rsh.
> 
> However, we recently switched to using hydra: among other
> advantages, its support for gridengine means we no longer need to
> convert the hostfile to something mpirun_rsh understands.

I missed this in the earlier email.  Does this mean that the correct
behavior is seen with mpirun_rsh but not hydra?

> >This type of issue may affect a decent number of users so there is no
> >harm in discussing this here.  Thanks for your note.  I hope that we can
> >find an acceptable solution for your situation.
> ...
> 
> Thanks :)
> 
> Since the Oracle takeover of Sun and subsequent forking of the
> software, gridengine turned into a hard target to track. The
> description above essentially describes the situation pre-fork and
> probably describes the lowest common denominator today.
> 
> Some of the forks have since moved on to trying to use cpusets
> instead of libnuma's affinity routines to perform mandatory
> restrictions, but the situation is still in flux. I'm currently
> investigating these.

I'm not familiar with the current state of Grid Engine.  I do hope that
they are moving forward in this direction as it will make it easier for
lots of software to "just work".

-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo


