[mvapich-discuss] (no subject)

Jonathan Perkins perkinjo at cse.ohio-state.edu
Wed Dec 16 10:01:25 EST 2015


I believe that with a default installation of SLURM, affinity is not handled
by SLURM.  If additional plugins are enabled, then it may be better to just
go with those and disable MVAPICH2's affinity (MV2_ENABLE_AFFINITY=0).
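
As a rough, untested sketch of that combination (the flag spellings and
plugin availability depend on your SLURM version, and ./my_mpi_app is just
a placeholder):

  # slurm.conf (admin side): let SLURM's task plugin handle binding,
  # using task/affinity or task/cgroup if your build includes them
  TaskPlugin=task/affinity

  # job script (user side): disable MVAPICH2's own binding so the two
  # layers don't fight over CPU placement, then let srun place the tasks
  export MV2_ENABLE_AFFINITY=0
  srun --cpu_bind=cores ./my_mpi_app   # newer SLURM spells this --cpu-bind

If I recall correctly, MV2_SHOW_CPU_BINDING=1 will print the binding that
MVAPICH2 applies, which can help confirm the overlap you're seeing before
its affinity is turned off.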

Although the GDR build will work with host-only applications, I would
suggest using two installations, where the GDR version is used by
applications that take advantage of GPU transfers.
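
For example, the two builds could be exposed as separate modules (module
names and versions below are only placeholders for whatever your site uses):

  # host-only (CPU) applications
  module load mvapich2/2.2a

  # applications that move data to and from GPU buffers
  module load mvapich2-gdr/2.1
  export MV2_USE_CUDA=1   # typically needed to enable device-buffer support

Keeping the GDR build as its own module also avoids pulling the CUDA
dependency into plain host-only jobs.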

On Wed, Dec 16, 2015 at 9:48 AM Novosielski, Ryan <novosirj at ca.rutgers.edu>
wrote:

> Thanks Jonathan. This is mostly the case with us too, but I think affinity
> is also managed by SLURM even in those cases. Unless there is a reason
> MVAPICH2 would do a better job?
>
> Thanks for the information on MVAPICH2-GDR. Do I need a second copy of
> MVAPICH2 for that, or is it a superset of the regular MVAPICH2's features?
>
> ____ *Note: UMDNJ is now Rutgers-Biomedical and Health Sciences*
>  || \\UTGERS      |---------------------*O*---------------------
>  ||_// Biomedical | Ryan Novosielski - Senior Technologist
>  || \\ and Health | novosirj at rutgers.edu - 973/972.0922 (2x0922)
>  ||  \\  Sciences | OIRT/High Perf & Res Comp - MSB C630, Newark
>       `'
>
> On Dec 16, 2015, at 09:43, Jonathan Perkins <perkinjo at cse.ohio-state.edu>
> wrote:
>
> Hello Ryan:
>
> The CPU affinity feature of MVAPICH2 was designed with only a single job
> running on each node in mind.  This is the more common case in HPC than
> allowing multiple jobs to run on each node.  If you're trying to use SLURM
> to manage multiple jobs on each node, it may be useful to explore cgroups,
> as you've mentioned in your 4th question.
>
> Please note that for jobs using GPUs we recommend the MVAPICH2-GDR
> library, as it includes many new features for better performance and
> scalability.
>
> You can find out more about it via:
> http://mvapich.cse.ohio-state.edu/overview/#mv2gdr
>
> You can download via:
> http://mvapich.cse.ohio-state.edu/downloads/#mv2gdr
>
> On Tue, Dec 15, 2015 at 1:27 PM Novosielski, Ryan <novosirj at ca.rutgers.edu>
> wrote:
>
>> Hi all,
>>
>> I'm using MVAPICH2 with SLURM's PMI2 interface. I'm therefore not using
>> mpirun/mpiexec at all. A user of mine is running some GPU jobs, which
>> require very small numbers of CPUs. So he's frequently not using the whole
>> node, and frequently running more than one job per node. MVAPICH2's
>> affinity stubbornly forces the jobs to bind to the same processors. The
>> solution is to turn affinity off.
>>
>> I have some questions about this:
>>
>> 1) Is there an imaginable scenario where, running with SLURM, I could
>> ever want this feature enabled? Should I somehow look at disabling it
>> system-wide or in the MVAPICH2 compile?
>> 2) If MVAPICH2 can't tell that a processor is already being used at 100%,
>> how can this feature ever work correctly? Just curious about the use case
>> in a different setting. Is it not meant to co-exist with multiple jobs on
>> the same node?
>> 3) I'd like this to be easy for the users. Should I just turn it off in
>> the module that is loaded for MVAPICH2 to prevent this from being an issue?
>> 4) Any thought as to whether integrating cgroups into SLURM might solve
>> the problem (e.g., SLURM won't even let MVAPICH2 see the other CPUs, so
>> affinity is a non-issue)?
>>
>> I'd welcome any other advice other sites have about this.
>>
>> --
>> ____ *Note: UMDNJ is now Rutgers-Biomedical and Health Sciences*
>>  || \\UTGERS      |---------------------*O*---------------------
>>  ||_// Biomedical | Ryan Novosielski - Senior Technologist
>>  || \\ and Health | novosirj at rutgers.edu - 973/972.0922 (2x0922)
>>  ||  \\  Sciences | OIRT/High Perf & Res Comp - MSB C630, Newark
>>       `'
>>