[mvapich-discuss] process binding with mvapich2-1.7rc2
Dan Kokron
daniel.kokron at nasa.gov
Sat Oct 15 12:43:15 EDT 2011
Good call. It's working now.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P COMMAND
13186 dkokron 20 0 1711m 527m 20m R 100 2.2 6:47.66 3 GSIsa.x
13187 dkokron 20 0 1711m 527m 21m R 100 2.2 6:48.50 9 GSIsa.x
13184 dkokron 20 0 1777m 604m 30m R 100 2.5 6:42.16 1 GSIsa.x
13185 dkokron 20 0 1715m 530m 19m R 100 2.2 6:48.33 7 GSIsa.x
Dan
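
A note on reading the output above: the P column in 'top' reports the CPU a
process last ran on, not the set of CPUs it is allowed to run on. As a minimal
sketch, assuming a Linux compute node with /proc and pgrep available, the
allowed set could be read from the kernel directly. The helper names
allowed_cpus and show_binding are made up for illustration and are not part of
mvapich2 or hydra.

```shell
# Print the CPU list a process may run on, as recorded by the Linux kernel.
# $1: path to a status file, normally /proc/<pid>/status
allowed_cpus() {
    awk '/^Cpus_allowed_list:/ { print $2 }' "$1"
}

# Print "PID allowed-CPUs" for every process whose command name is exactly $1.
show_binding() {
    for pid in $(pgrep -x "$1"); do
        printf '%s %s\n' "$pid" "$(allowed_cpus "/proc/$pid/status")"
    done
}
```

Running "show_binding GSIsa.x" on a node would be expected to list a single
allowed CPU per rank once the binding has taken effect.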
On Fri, 2011-10-14 at 12:08 -0500, Devendar Bureddy wrote:
> Hi Dan
>
> This behaviour is caused by conflicting bindings between the
> launcher (hydra) and the mvapich2 library. You can fix it in one of
> the following ways:
>
> 1) turn off binding in MPI library (MV2_ENABLE_AFFINITY=0)
>
> #mvapich2-1.7rc2/install/bin/mpirun -prepend-rank
> -launcher-exec /usr/bin/sshmpi -binding user:1,7,3,9 -topolib hwloc -f
> $PBS_NODEFILE -env MV2_ENABLE_AFFINITY 0 -np 256
>
> 2) turn off hydra mapping and specify binding with MV2 library
> ( MV2_CPU_MAPPING=1:7:3:9)
>
> #mvapich2-1.7rc2/install/bin/mpirun -prepend-rank
> -launcher-exec /usr/bin/sshmpi -f $PBS_NODEFILE -env
> MV2_CPU_MAPPING 1:7:3:9 -np 256
>
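
The two options above use different list separators for the same cores:
commas after "user:" for hydra, colons for MV2_CPU_MAPPING. A minimal sketch
of deriving both flag sets from one core list follows. The function name
to_mv2_mapping is hypothetical, and the "-env NAME VALUE" spelling is assumed
to match the form used in option 1.

```shell
# Convert a comma-separated core list (optionally prefixed with "user:",
# as passed to hydra's -binding) into the colon-separated MV2_CPU_MAPPING form.
to_mv2_mapping() {
    printf '%s\n' "${1#user:}" | tr ',' ':'
}

cores="1,7,3,9"

# Option 1: the launcher binds; the library's own affinity is switched off.
printf '%s\n' "-binding user:$cores -env MV2_ENABLE_AFFINITY 0"

# Option 2: no launcher binding; the library binds using its own mapping.
printf '%s\n' "-env MV2_CPU_MAPPING $(to_mv2_mapping "$cores")"
```

Only one of the two flag sets should be passed to mpirun, so that a single
component is responsible for the binding.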
>
> Let us know if it works properly
>
> -Devendar
>
>
>
> On Fri, Oct 14, 2011 at 11:16 AM, Dan Kokron <daniel.kokron at nasa.gov>
> wrote:
> I am using mvapich2-1.7rc2, configured as follows:
>
> ./configure CC=icc CXX=icpc F77=ifort FC=ifort CFLAGS=-fpic
> CXXFLAGS=-fpic FFLAGS=-fpic FCFLAGS=-fpic
> --prefix=/home/dkokron/play/mvapich2-1.7rc2/install
> --enable-f77
> --enable-fc --enable-cxx --enable-romio
> --enable-threads=default
> --with-device=ch3:mrail --with-rdma=gen2 --with-hwloc
> -disable-multi-aliases -enable-xrc=yes -enable-hybrid
>
> I specified a process binding of 1,7,3,9 on the command line
> as follows
>
> mvapich2-1.7rc2/install/bin/mpirun -prepend-rank
> -launcher-exec /usr/bin/sshmpi -binding user:1,7,3,9 -topolib hwloc -f
> $PBS_NODEFILE -np 256
>
> 'top' shows a different binding (0,1,2,3)
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P COMMAND
> 9106 dkokron 20 0 1764m 586m 30m R 100 2.4 9:58.24 0 GSIsa.x
> 9107 dkokron 20 0 1692m 506m 18m R 100 2.1 10:05.06 1 GSIsa.x
> 9108 dkokron 20 0 1691m 506m 18m R 100 2.1 10:03.42 2 GSIsa.x
> 9109 dkokron 20 0 1692m 507m 18m R 100 2.1 10:04.35 3 GSIsa.x
>
> lstopo from hwloc-1.3 shows the following for each compute
> node
>
> Machine (24GB)
> NUMANode L#0 (P#0 12GB) + Socket L#0 + L3 L#0 (12MB)
> L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0 + PU L#0 (P#0)
> L2 L#1 (256KB) + L1 L#1 (32KB) + Core L#1 + PU L#1 (P#1)
> L2 L#2 (256KB) + L1 L#2 (32KB) + Core L#2 + PU L#2 (P#2)
> L2 L#3 (256KB) + L1 L#3 (32KB) + Core L#3 + PU L#3 (P#3)
> L2 L#4 (256KB) + L1 L#4 (32KB) + Core L#4 + PU L#4 (P#4)
> L2 L#5 (256KB) + L1 L#5 (32KB) + Core L#5 + PU L#5 (P#5)
> NUMANode L#1 (P#1 12GB) + Socket L#1 + L3 L#1 (12MB)
> L2 L#6 (256KB) + L1 L#6 (32KB) + Core L#6 + PU L#6 (P#6)
> L2 L#7 (256KB) + L1 L#7 (32KB) + Core L#7 + PU L#7 (P#7)
> L2 L#8 (256KB) + L1 L#8 (32KB) + Core L#8 + PU L#8 (P#8)
> L2 L#9 (256KB) + L1 L#9 (32KB) + Core L#9 + PU L#9 (P#9)
> L2 L#10 (256KB) + L1 L#10 (32KB) + Core L#10 + PU L#10 (P#10)
> L2 L#11 (256KB) + L1 L#11 (32KB) + Core L#11 + PU L#11 (P#11)
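
Read against the lstopo output above, the requested list 1,7,3,9 alternates
between the two NUMA nodes. The sketch below makes that explicit. It is
hard-coded for this particular node layout (P#0-5 on node 0, P#6-11 on node 1),
and numa_of_core is a hypothetical helper, not an hwloc or mvapich2 command.

```shell
# Valid only for the topology shown above: six cores per NUMA node,
# numbered consecutively, so integer division by 6 gives the node index.
numa_of_core() {
    echo $(( $1 / 6 ))
}

for core in 1 7 3 9; do
    echo "core $core -> NUMA node $(numa_of_core "$core")"
done
# core 1 -> NUMA node 0
# core 7 -> NUMA node 1
# core 3 -> NUMA node 0
# core 9 -> NUMA node 1
```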
> --
> Dan Kokron
> Global Modeling and Assimilation Office, Code 610.1
> NASA Goddard Space Flight Center
> Greenbelt, MD 20771 USA
> Daniel.S.Kokron at nasa.gov
> Phone: (301) 614-5192
> Fax: (301) 614-5304