[mvapich-discuss] process binding with mvapich2-1.7rc2

Dan Kokron daniel.kokron at nasa.gov
Sat Oct 15 12:43:15 EDT 2011


Good call.  It's working now.

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+   P COMMAND        
13186 dkokron   20   0 1711m 527m  20m R  100  2.2   6:47.66  3 GSIsa.x         
13187 dkokron   20   0 1711m 527m  21m R  100  2.2   6:48.50  9 GSIsa.x         
13184 dkokron   20   0 1777m 604m  30m R  100  2.5   6:42.16  1 GSIsa.x         
13185 dkokron   20   0 1715m 530m  19m R  100  2.2   6:48.33  7 GSIsa.x
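
A cross-check, as a minimal sketch: the P column in top reports the CPU a task last ran on, so it shows the binding only indirectly. Assuming a Linux node with /proc mounted and with awk and pgrep installed, the affinity list itself can be read from the Cpus_allowed_list field of /proc/<pid>/status. The function names allowed_cpus and report_bindings are made up for this illustration.

```shell
#!/bin/sh
# Sketch only: assumes Linux (/proc) plus awk and pgrep on the compute node.

# Print the allowed-CPU list the kernel records for one PID.
allowed_cpus() {
    awk '/^Cpus_allowed_list:/ { print $2 }' "/proc/$1/status"
}

# Print "PID allowed-CPUs" for every process whose name matches exactly.
report_bindings() {
    for pid in $(pgrep -x "$1"); do
        printf '%s %s\n' "$pid" "$(allowed_cpus "$pid")"
    done
}
```

Calling report_bindings GSIsa.x on a compute node should then print one line per rank; if the binding took effect, each rank should be restricted to a single core from the requested list.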

Dan

On Fri, 2011-10-14 at 12:08 -0500, Devendar Bureddy wrote:
> Hi Dan
> 
> This behaviour is caused by conflicting bindings between the
> launcher (hydra) and the mvapich2 library. You can fix it in one of
> the following ways:
> 
> 1)  Turn off binding in the MPI library (MV2_ENABLE_AFFINITY=0)
>   
>     #mvapich2-1.7rc2/install/bin/mpirun   -prepend-rank
> -launcher-exec /usr/bin/sshmpi -binding user:1,7,3,9 -topolib hwloc -f
> $PBS_NODEFILE -env MV2_ENABLE_AFFINITY 0 -np 256
> 
> 2)  Turn off hydra binding and specify the binding through the MV2
> library (MV2_CPU_MAPPING=1:7:3:9)
> 
> #mvapich2-1.7rc2/install/bin/mpirun   -prepend-rank
> -launcher-exec /usr/bin/sshmpi -f $PBS_NODEFILE -env
> MV2_CPU_MAPPING 1:7:3:9  -np 256
> 
> 
> Let us know if it works properly
> 
> -Devendar
> 
> 
> 
> On Fri, Oct 14, 2011 at 11:16 AM, Dan Kokron <daniel.kokron at nasa.gov>
> wrote:
>         I am using mvapich2-1.7rc2 configured as follows
>         
>         ./configure CC=icc CXX=icpc F77=ifort FC=ifort CFLAGS=-fpic
>         CXXFLAGS=-fpic FFLAGS=-fpic FCFLAGS=-fpic
>         --prefix=/home/dkokron/play/mvapich2-1.7rc2/install
>         --enable-f77
>         --enable-fc --enable-cxx --enable-romio
>         --enable-threads=default
>         --with-device=ch3:mrail --with-rdma=gen2 --with-hwloc
>         -disable-multi-aliases -enable-xrc=yes -enable-hybrid
>         
>         I specified a process binding of 1,7,3,9 on the command line
>         as follows
>         
>         mvapich2-1.7rc2/install/bin/mpirun   -prepend-rank
>         -launcher-exec /usr/bin/sshmpi -binding user:1,7,3,9 -topolib
>         hwloc -f
>         $PBS_NODEFILE -np 256
>         
>         'top' shows a different binding (0,1,2,3)
>         
>           PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+   P COMMAND
>          9106 dkokron   20   0 1764m 586m  30m R  100  2.4   9:58.24  0 GSIsa.x
>          9107 dkokron   20   0 1692m 506m  18m R  100  2.1  10:05.06  1 GSIsa.x
>          9108 dkokron   20   0 1691m 506m  18m R  100  2.1  10:03.42  2 GSIsa.x
>          9109 dkokron   20   0 1692m 507m  18m R  100  2.1  10:04.35  3 GSIsa.x
>         
>         lstopo from hwloc-1.3 shows the following for each compute
>         node
>         
>         Machine (24GB)
>          NUMANode L#0 (P#0 12GB) + Socket L#0 + L3 L#0 (12MB)
>            L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0 + PU L#0 (P#0)
>            L2 L#1 (256KB) + L1 L#1 (32KB) + Core L#1 + PU L#1 (P#1)
>            L2 L#2 (256KB) + L1 L#2 (32KB) + Core L#2 + PU L#2 (P#2)
>            L2 L#3 (256KB) + L1 L#3 (32KB) + Core L#3 + PU L#3 (P#3)
>            L2 L#4 (256KB) + L1 L#4 (32KB) + Core L#4 + PU L#4 (P#4)
>            L2 L#5 (256KB) + L1 L#5 (32KB) + Core L#5 + PU L#5 (P#5)
>          NUMANode L#1 (P#1 12GB) + Socket L#1 + L3 L#1 (12MB)
>            L2 L#6 (256KB) + L1 L#6 (32KB) + Core L#6 + PU L#6 (P#6)
>            L2 L#7 (256KB) + L1 L#7 (32KB) + Core L#7 + PU L#7 (P#7)
>            L2 L#8 (256KB) + L1 L#8 (32KB) + Core L#8 + PU L#8 (P#8)
>            L2 L#9 (256KB) + L1 L#9 (32KB) + Core L#9 + PU L#9 (P#9)
>            L2 L#10 (256KB) + L1 L#10 (32KB) + Core L#10 + PU L#10 (P#10)
>            L2 L#11 (256KB) + L1 L#11 (32KB) + Core L#11 + PU L#11 (P#11)
>         --
>         Dan Kokron
>         Global Modeling and Assimilation Office, Code 610.1
>         NASA Goddard Space Flight Center
>         Greenbelt, MD 20771 USA
>         Daniel.S.Kokron at nasa.gov
>         Phone: (301) 614-5192
>         Fax:   (301) 614-5304
>         
> 


