[mvapich-discuss] (no subject)

Novosielski, Ryan novosirj at ca.rutgers.edu
Wed Dec 16 10:11:44 EST 2015


Indeed, and this is what the users do when they want that. But some of them don't. Again, the most common example is when someone wants to use GPUs. With some software you can't use more than one or two GPUs effectively, so it is pretty common to run more than one job on a node. This doesn't have a negative impact on performance when affinity is working properly.

____ *Note: UMDNJ is now Rutgers-Biomedical and Health Sciences*
|| \\UTGERS      |---------------------*O*---------------------
||_// Biomedical | Ryan Novosielski - Senior Technologist
|| \\ and Health | novosirj at rutgers.edu - 973/972.0922 (2x0922)
||  \\  Sciences | OIRT/High Perf & Res Comp - MSB C630, Newark
    `'

On Dec 16, 2015, at 09:50, John Donners <john.donners at surfsara.nl> wrote:

Hello Ryan,

have you tried to use srun with the --exclusive option?
The man page reads:

'This  option  can also be used when initiating more than one job step
within an existing resource allocation, where you want separate
processors to be dedicated to each job step.'

Cheers,
John

On 16-12-15 15:43, Jonathan Perkins wrote:
Hello Ryan:

The CPU affinity feature of MVAPICH2 was designed around a single
job running on each node.  In HPC this is a more common case than
allowing multiple jobs to run on each node.  If you're trying to use
SLURM to manage multiple jobs on each node, it may be useful to explore
cgroups as you've mentioned in your 4th question.
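
As a rough sketch of the cgroup route (the exact settings depend on your
SLURM version and site policy), confining each job's tasks to its
allocated cores usually comes down to something like:

  # slurm.conf
  TaskPlugin=task/cgroup

  # cgroup.conf
  CgroupAutomount=yes
  ConstrainCores=yes

The idea is that each job's processes are then restricted to the CPUs
SLURM allocated to it, which is the "affinity is a non-issue" scenario
from the 4th question.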

Please note, for jobs using GPUs we recommend using the MVAPICH2-GDR
library as it uses many new advanced features for better performance
and scalability.

You can find out more about it via:
http://mvapich.cse.ohio-state.edu/overview/#mv2gdr

You can download it via:
http://mvapich.cse.ohio-state.edu/downloads/#mv2gdr

On Tue, Dec 15, 2015 at 1:27 PM Novosielski, Ryan
<novosirj at ca.rutgers.edu> wrote:

   Hi all,

   I'm using MVAPICH2 with SLURM's PMI2 interface. I'm therefore not
   using mpirun/mpiexec at all. A user of mine is running some GPU
   jobs, which require very small numbers of CPUs. So he's
   frequently not using the whole node, and frequently running more
   than one job on it. MVAPICH2's affinity stubbornly forces the jobs
   to bind to the same processors. The solution is to turn affinity off.
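
   For reference, a minimal sketch of what turning it off per job looks
   like, using MVAPICH2's MV2_ENABLE_AFFINITY environment variable
   (./gpu_app is only a placeholder):

       export MV2_ENABLE_AFFINITY=0
       srun ./gpu_app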

   I have some questions about this:

   1) Is there an imaginable scenario where, running with SLURM, I
   could ever want this feature enabled? Should I somehow look at
   disabling it system-wide or in the MVAPICH2 compile?
   2) If MVAPICH2 can't tell that a processor is already being used
   at 100%, how can this feature ever work correctly? I'm just curious
   about the use case in a different setting. Is it not meant for two
   jobs to co-exist on the same node?
   3) I'd like this to be easy for the users. Should I just turn it
   off in the module that is loaded for MVAPICH2, to prevent this from
   being an issue? (See the sketch after this list.)
   4) Any thoughts on whether integrating cgroups with SLURM might solve
   the problem (e.g. SLURM won't even let MVAPICH2 see the other CPUs,
   so affinity is a non-issue)?
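
   A minimal sketch of the modulefile idea from question 3, assuming a
   Tcl modulefile (Lmod/Lua sites would use the equivalent setenv call):

       # in the mvapich2 modulefile that users load
       setenv MV2_ENABLE_AFFINITY 0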

   I'd welcome any other advice other sites have about this.

   --
   ____ *Note: UMDNJ is now Rutgers-Biomedical and Health Sciences*
    || \\UTGERS |---------------------*O*---------------------
    ||_// Biomedical | Ryan Novosielski - Senior Technologist
    || \\ and Health | novosirj at rutgers.edu - 973/972.0922 (2x0922)
    ||  \\  Sciences | OIRT/High Perf & Res Comp - MSB C630, Newark
         `'

   _______________________________________________
   mvapich-discuss mailing list
   mvapich-discuss at cse.ohio-state.edu
   http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss



_______________________________________________
mvapich-discuss mailing list
mvapich-discuss at cse.ohio-state.edu
http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss


--
SURFdrive: the personal cloud storage service for Dutch higher education and research.

| John Donners | Senior advisor | Operations, Support & Development | SURFsara | Science Park 140 | 1098 XG Amsterdam | The Netherlands |
T (31)6 19039023 | john.donners at surfsara.nl | www.surfsara.nl |

Present on | Mon | Tue | Wed | Thu | Fri

