[mvapich-discuss] Non-MPI_THREAD-SINGLE mode with enabled MV2 affinity?

Thiago Quirino - NOAA Federal thiago.quirino at noaa.gov
Mon Nov 18 15:44:35 EST 2013


Hi, Jonathan.

I tried the syntax you posted, with the additional "KMP_AFFINITY=disabled"
setting (I use the Intel compiler, and Intel's affinity handling seems to
override mpiexec.hydra's affinity unless it is manually disabled). Indeed,
with that syntax the 4 MPI processes in each node were pinned as expected
to cores 0, 4, 8, and 12. However, the 4 Pthreads spawned by each MPI
process were also pinned to those same respective cores (0, 4, 8, and 12).
That is, for example, the 4 Pthreads spawned by the first MPI process in
each node all ended up running on core 0 rather than anywhere among cores
0-3. The same happened for the remaining MPI processes.
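
For reference, the full launch line I am using looks roughly like this
(the executable name is just a placeholder, and I pass KMP_AFFINITY
through -env in the same way as MV2_ENABLE_AFFINITY):

  mpiexec.hydra -n 4 \
      -bind-to user:0+1+2+3,4+5+6+7,8+9+10+11,12+13+14+15 \
      -env MV2_ENABLE_AFFINITY=0 -env KMP_AFFINITY=disabled \
      ./my_hybrid_app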

My program continually spawns, runs, and deletes 4 pthreads during each
iteration (e.g. each integration timestep) of its main loop. It looks as
if the Pthreads are spawned and then pinned only to the first core of the
mask given to their MPI process in the mpiexec.hydra call.

Is this expected?
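
In case it helps clarify what I mean, here is a minimal sketch of the
workaround I am considering: after each pthread_create, explicitly widen
the new thread's affinity mask back to the 4-core range owned by its
parent MPI rank (this assumes Linux and glibc's pthread_setaffinity_np;
the per-rank base core 0/4/8/12 would be computed from the local rank,
which I omit here):

  #define _GNU_SOURCE
  #include <pthread.h>
  #include <sched.h>

  static void *worker(void *arg)
  {
      /* ... per-timestep work ... */
      return NULL;
  }

  /* Spawn one worker and widen its mask to cores [base, base+3],
     where base is 0, 4, 8, or 12 depending on the local MPI rank. */
  static int spawn_worker(pthread_t *tid, int base)
  {
      cpu_set_t set;
      CPU_ZERO(&set);
      for (int c = base; c < base + 4; c++)
          CPU_SET(c, &set);

      if (pthread_create(tid, NULL, worker, NULL) != 0)
          return -1;

      /* Override the single-core mask inherited from the parent. */
      return pthread_setaffinity_np(*tid, sizeof(cpu_set_t), &set);
  }

I would prefer not to hard-code this if mpiexec.hydra can already give
the spawned threads the full 4-core mask.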

Thank you abundantly.
Thiago.



On Fri, Nov 15, 2013 at 10:31 AM, Jonathan Perkins <
perkinjo at cse.ohio-state.edu> wrote:

> Hello Thiago.  I've confirmed that the following syntax will give you
> what you're asking for.  You'll basically need to specify user
> binding, replace your commas with pluses, and replace your colons with
> commas.
>
> [perkinjo at sandy1 install]$ ./bin/mpiexec -n 4 -bind-to
> user:0+1+2+3,4+5+6+7,8+9+10+11,12+13+14+15 -env MV2_ENABLE_AFFINITY=0
> ./cpumask
> 1111000000000000
> 0000111100000000
> 0000000011110000
> 0000000000001111
>
> On Wed, Nov 13, 2013 at 7:02 PM, Thiago Quirino - NOAA Federal
> <thiago.quirino at noaa.gov> wrote:
> > Hi, Jonathan.
> >
> > Using mpiexec.hydra's binding capability, is it possible to assign a
> > CPU range to each MPI task in a node? Suppose I want to spawn 4 MPI
> > tasks per node, where each node has 2 sockets with 8 cores each
> > (16 cores total). I want task 1 to run on CPU range 0-3, task 2 on
> > CPU range 4-7, task 3 on CPU range 8-11, and task 4 on CPU range
> > 12-15. I used to accomplish this using the MV2_CPU_MAPPING variable
> > as follows:
> >
> > export MV2_CPU_MAPPING=0,1,2,3:4,5,6,7:8,9,10,11:12,13,14,15
> >
> > Can I accomplish the same binding configuration with mpiexec.hydra's
> > binding capability? I only see socket binding options in the Wiki.
> >
> > Thanks again, Jonathan.
> > Thiago.
> >
> >
> >
> > On Tue, Nov 12, 2013 at 1:56 PM, Jonathan Perkins
> > <jonathan.lamar.perkins at gmail.com> wrote:
> >>
> >> Hello Thiago.  Perhaps you can try an alternative to
> >> MV2_ENABLE_AFFINITY.  If you use the hydra process manager
> >> (mpiexec.hydra), you can disable the library affinity and use the
> >> launcher affinity instead.  In this case the other threading levels
> >> will be available to you.
> >>
> >> Please see
> >> https://wiki.mpich.org/mpich/index.php/Using_the_Hydra_Process_Manager#Process-core_Binding
> >> for more information on how to use this hydra feature.  Also, please
> >> do not forget to set MV2_ENABLE_AFFINITY to 0.
> >>
> >> Please let us know if this helps.
> >>
> >> On Fri, Nov 8, 2013 at 6:41 PM, Thiago Quirino - NOAA Federal
> >> <thiago.quirino at noaa.gov> wrote:
> >> > Hi, folks. Quick question about MVAPICH2 and affinity support.
> >> >
> >> > Is it possible to invoke MPI_Init_thread with any mode other than
> >> > "MPI_THREAD_SINGLE" and still use "MV2_ENABLE_AFFINITY=1"? In my
> >> > hybrid application I mix MPI with raw Pthreads (not OpenMP). I
> >> > start 4 MPI tasks in each 16-core node, where each node has 2
> >> > sockets with 8 Sandy Bridge cores each. Each of the 4 MPI tasks
> >> > then spawns 4 pthreads for a total of 16 pthreads/node, or 1
> >> > pthread/core. Within each MPI task, the MPI calls are serialized
> >> > among the 4 pthreads, so I can use any MPI_THREAD_* mode, but I
> >> > don't know which mode will work best. I want to assign each of the
> >> > 4 MPI tasks in a node a set of 4 cores using MV2_CPU_MAPPING (e.g.
> >> > export MV2_CPU_MAPPING=0,1,2,3:4,5,6,7:8,9,10,11:12,13,14,15) so
> >> > that the 4 pthreads spawned by each MPI task can migrate to any
> >> > processor within its exclusive CPU set of size 4.
> >> >
> >> > Is that possible with modes other than MPI_THREAD_SINGLE? If not,
> >> > do you foresee any issues with using MPI_THREAD_SINGLE while
> >> > serializing the MPI calls among the 4 pthreads of each MPI task?
> >> > That is, is there any advantage to using MPI_THREAD_FUNNELED or
> >> > MPI_THREAD_SERIALIZED versus MPI_THREAD_SINGLE for serialized
> >> > calls among pthreads?
> >> >
> >> > Thank you so much, folks. Any help is much appreciated.
> >> >
> >> > Best,
> >> > Thiago.
> >> >
> >> >
> >> > ---------------------------------------------------
> >> > Thiago Quirino, Ph.D.
> >> > NOAA Hurricane Research Division
> >> > 4350 Rickenbacker Cswy.
> >> > Miami, FL 33139
> >> > P: 305-361-4503
> >> > E: Thiago.Quirino at noaa.gov
> >> >
> >>
> >>
> >>
> >> --
> >> Jonathan Perkins
> >
> >
> >
>
>
>
> --
> Jonathan Perkins
> http://www.cse.ohio-state.edu/~perkinjo
>
>