[mvapich-discuss] process binding with mvapich2-1.7rc2

Devendar Bureddy bureddy at cse.ohio-state.edu
Fri Oct 14 13:08:59 EDT 2011


Hi Dan

This behaviour is caused by conflicting bindings between the launcher (hydra)
and the MVAPICH2 library. You can get the correct binding in one of the
following ways:

1)  Turn off binding in the MPI library (MV2_ENABLE_AFFINITY=0) and let
    hydra do the binding:

    #mvapich2-1.7rc2/install/bin/mpirun -prepend-rank \
        -launcher-exec /usr/bin/sshmpi -binding user:1,7,3,9 \
        -topolib hwloc -f $PBS_NODEFILE -env MV2_ENABLE_AFFINITY 0 -np 256

2)  Turn off hydra binding and specify the binding through the MVAPICH2
    library (MV2_CPU_MAPPING=1:7:3:9):

    #mvapich2-1.7rc2/install/bin/mpirun -prepend-rank \
        -launcher-exec /usr/bin/sshmpi -f $PBS_NODEFILE \
        -env MV2_CPU_MAPPING 1:7:3:9 -np 256
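
Note that the two options write the same core list with different
separators: hydra takes user:1,7,3,9 while MV2_CPU_MAPPING takes 1:7:3:9.
The sketch below only illustrates that relationship. The helper names
show_mapping and mv2_to_hydra are made up for this example and are not part
of MVAPICH2 or hydra, and it assumes the entries are handed out to the
processes on a node in order.

```shell
# Hypothetical helpers for illustration only; not part of MVAPICH2 or hydra.

# Print which core each process on a node would get from a colon-separated
# mapping such as 1:7:3:9 (assumes entries are assigned in order).
# The body runs in a subshell so the IFS change does not leak out.
show_mapping() (
    IFS=:
    rank=0
    for core in $1; do
        echo "local rank $rank -> core $core"
        rank=$((rank + 1))
    done
)

# Convert the MV2_CPU_MAPPING form (1:7:3:9) to the hydra -binding
# form (user:1,7,3,9).
mv2_to_hydra() {
    echo "user:$(echo "$1" | tr ':' ',')"
}
```

For example, "show_mapping 1:7:3:9" lists four processes per node. According
to the lstopo output in the original report, cores 1 and 3 are on socket 0
and cores 7 and 9 are on socket 1, so this mapping alternates between the
two sockets.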


Let us know if it works properly.
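
One possible way to check the result on a compute node: the P column in
'top' shows the processor a task last ran on, while the kernel's affinity
mask is available in /proc/<pid>/status on Linux. The following is a minimal
sketch under that assumption. The function names cpus_from_status and
report_affinity are made up for this example, and it assumes pgrep and awk
are installed.

```shell
# Hypothetical helpers for illustration only (Linux, needs awk and pgrep).

# Extract the allowed-CPU list from a /proc/<pid>/status file.
cpus_from_status() {
    awk '/^Cpus_allowed_list:/ { print $2 }' "$1"
}

# Print "pid: allowed CPUs" for every running process with the given name.
report_affinity() {
    for pid in $(pgrep -x "$1"); do
        echo "$pid: $(cpus_from_status "/proc/$pid/status")"
    done
}
```

Running "report_affinity GSIsa.x" (the executable name from the 'top'
output in the original report) on a node should then print one line per
process. If the binding took effect, one would expect each process to be
restricted to a single core out of 1, 7, 3 and 9 rather than a range of
cores.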

-Devendar



On Fri, Oct 14, 2011 at 11:16 AM, Dan Kokron <daniel.kokron at nasa.gov> wrote:

> I am using mvapich2-1.7rc2 configured as follows
>
> ./configure CC=icc CXX=icpc F77=ifort FC=ifort CFLAGS=-fpic
> CXXFLAGS=-fpic FFLAGS=-fpic FCFLAGS=-fpic
> --prefix=/home/dkokron/play/mvapich2-1.7rc2/install --enable-f77
> --enable-fc --enable-cxx --enable-romio --enable-threads=default
> --with-device=ch3:mrail --with-rdma=gen2 --with-hwloc
> -disable-multi-aliases -enable-xrc=yes -enable-hybrid
>
> I specified a process binding of 1,7,3,9 on the command line as follows
>
> mvapich2-1.7rc2/install/bin/mpirun   -prepend-rank
> -launcher-exec /usr/bin/sshmpi -binding user:1,7,3,9 -topolib hwloc -f
> $PBS_NODEFILE -np 256
>
> 'top' shows a different binding (0,1,2,3)
>
>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+   P COMMAND
>  9106 dkokron   20   0 1764m 586m  30m R  100  2.4   9:58.24  0 GSIsa.x
>  9107 dkokron   20   0 1692m 506m  18m R  100  2.1  10:05.06  1 GSIsa.x
>  9108 dkokron   20   0 1691m 506m  18m R  100  2.1  10:03.42  2 GSIsa.x
>  9109 dkokron   20   0 1692m 507m  18m R  100  2.1  10:04.35  3 GSIsa.x
>
> lstopo from hwloc-1.3 shows the following for each compute node
>
> Machine (24GB)
>  NUMANode L#0 (P#0 12GB) + Socket L#0 + L3 L#0 (12MB)
>    L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0 + PU L#0 (P#0)
>    L2 L#1 (256KB) + L1 L#1 (32KB) + Core L#1 + PU L#1 (P#1)
>    L2 L#2 (256KB) + L1 L#2 (32KB) + Core L#2 + PU L#2 (P#2)
>    L2 L#3 (256KB) + L1 L#3 (32KB) + Core L#3 + PU L#3 (P#3)
>    L2 L#4 (256KB) + L1 L#4 (32KB) + Core L#4 + PU L#4 (P#4)
>    L2 L#5 (256KB) + L1 L#5 (32KB) + Core L#5 + PU L#5 (P#5)
>  NUMANode L#1 (P#1 12GB) + Socket L#1 + L3 L#1 (12MB)
>    L2 L#6 (256KB) + L1 L#6 (32KB) + Core L#6 + PU L#6 (P#6)
>    L2 L#7 (256KB) + L1 L#7 (32KB) + Core L#7 + PU L#7 (P#7)
>    L2 L#8 (256KB) + L1 L#8 (32KB) + Core L#8 + PU L#8 (P#8)
>    L2 L#9 (256KB) + L1 L#9 (32KB) + Core L#9 + PU L#9 (P#9)
>    L2 L#10 (256KB) + L1 L#10 (32KB) + Core L#10 + PU L#10 (P#10)
>    L2 L#11 (256KB) + L1 L#11 (32KB) + Core L#11 + PU L#11 (P#11)
> --
> Dan Kokron
> Global Modeling and Assimilation Office, Code 610.1
> NASA Goddard Space Flight Center
> Greenbelt, MD 20771 USA
> Daniel.S.Kokron at nasa.gov
> Phone: (301) 614-5192
> Fax:   (301) 614-5304