[mvapich-discuss] process binding with mvapich2-1.7rc2
Devendar Bureddy
bureddy at cse.ohio-state.edu
Fri Oct 14 13:08:59 EDT 2011
Hi Dan
This behaviour is caused by a conflict between the binding done by the
launcher (Hydra) and the binding done by the MVAPICH2 library. You can
correct it in either of the following ways:
1) Turn off binding in the MPI library (MV2_ENABLE_AFFINITY=0):
#mvapich2-1.7rc2/install/bin/mpirun -prepend-rank -launcher-exec
/usr/bin/sshmpi -binding user:1,7,3,9 -topolib hwloc -f $PBS_NODEFILE -env
MV2_ENABLE_AFFINITY 0 -np 256
2) Turn off Hydra's binding and specify the mapping through the MVAPICH2
library instead (MV2_CPU_MAPPING=1:7:3:9):
#mvapich2-1.7rc2/install/bin/mpirun -prepend-rank -launcher-exec
/usr/bin/sshmpi -f $PBS_NODEFILE -env MV2_CPU_MAPPING 1:7:3:9 -np 256
Let us know if it works properly.
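Either way, it may help to verify the effective binding from inside each
process rather than relying on top's last-used-CPU column, which only shows
where a process most recently ran. A minimal sketch (my suggestion, not part
of MVAPICH2; Linux-only, using Python's os.sched_getaffinity; the same check
can be done in C with sched_getaffinity(2)):

```python
import os

# Linux-only: report the set of CPUs this process is allowed to run on.
# With a user binding of 1,7,3,9 applied correctly, each rank on a node
# should report exactly one of those CPU numbers.
allowed = sorted(os.sched_getaffinity(0))  # 0 = the calling process
print("pid %d bound to CPUs: %s" % (os.getpid(), allowed))
```

Running one copy per rank (e.g. at the top of the application, or as a small
wrapper script) makes it easy to see whether the launcher or the library
binding actually took effect.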
-Devendar
On Fri, Oct 14, 2011 at 11:16 AM, Dan Kokron <daniel.kokron at nasa.gov> wrote:
> I am using mvapich2-1.7rc2 configured as follows
>
> ./configure CC=icc CXX=icpc F77=ifort FC=ifort CFLAGS=-fpic
> CXXFLAGS=-fpic FFLAGS=-fpic FCFLAGS=-fpic
> --prefix=/home/dkokron/play/mvapich2-1.7rc2/install --enable-f77
> --enable-fc --enable-cxx --enable-romio --enable-threads=default
> --with-device=ch3:mrail --with-rdma=gen2 --with-hwloc
> -disable-multi-aliases -enable-xrc=yes -enable-hybrid
>
> I specified a process binding of 1,7,3,9 on the command line as follows
>
> mvapich2-1.7rc2/install/bin/mpirun -prepend-rank
> -launcher-exec /usr/bin/sshmpi -binding user:1,7,3,9 -topolib hwloc -f
> $PBS_NODEFILE -np 256
>
> 'top' shows a different binding (0,1,2,3)
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P COMMAND
> 9106 dkokron 20 0 1764m 586m 30m R 100 2.4 9:58.24 0 GSIsa.x
> 9107 dkokron 20 0 1692m 506m 18m R 100 2.1 10:05.06 1 GSIsa.x
> 9108 dkokron 20 0 1691m 506m 18m R 100 2.1 10:03.42 2 GSIsa.x
> 9109 dkokron 20 0 1692m 507m 18m R 100 2.1 10:04.35 3 GSIsa.x
>
> lstopo from hwloc-1.3 shows the following for each compute node
>
> Machine (24GB)
> NUMANode L#0 (P#0 12GB) + Socket L#0 + L3 L#0 (12MB)
> L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0 + PU L#0 (P#0)
> L2 L#1 (256KB) + L1 L#1 (32KB) + Core L#1 + PU L#1 (P#1)
> L2 L#2 (256KB) + L1 L#2 (32KB) + Core L#2 + PU L#2 (P#2)
> L2 L#3 (256KB) + L1 L#3 (32KB) + Core L#3 + PU L#3 (P#3)
> L2 L#4 (256KB) + L1 L#4 (32KB) + Core L#4 + PU L#4 (P#4)
> L2 L#5 (256KB) + L1 L#5 (32KB) + Core L#5 + PU L#5 (P#5)
> NUMANode L#1 (P#1 12GB) + Socket L#1 + L3 L#1 (12MB)
> L2 L#6 (256KB) + L1 L#6 (32KB) + Core L#6 + PU L#6 (P#6)
> L2 L#7 (256KB) + L1 L#7 (32KB) + Core L#7 + PU L#7 (P#7)
> L2 L#8 (256KB) + L1 L#8 (32KB) + Core L#8 + PU L#8 (P#8)
> L2 L#9 (256KB) + L1 L#9 (32KB) + Core L#9 + PU L#9 (P#9)
> L2 L#10 (256KB) + L1 L#10 (32KB) + Core L#10 + PU L#10 (P#10)
> L2 L#11 (256KB) + L1 L#11 (32KB) + Core L#11 + PU L#11 (P#11)
> --
> Dan Kokron
> Global Modeling and Assimilation Office, Code 610.1
> NASA Goddard Space Flight Center
> Greenbelt, MD 20771 USA
> Daniel.S.Kokron at nasa.gov
> Phone: (301) 614-5192
> Fax: (301) 614-5304
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>