[mvapich-discuss] Problem with more MPI jobs on the same node

Dhabaleswar Panda panda at cse.ohio-state.edu
Sat Aug 29 11:39:22 EDT 2009


What is the output of top and mpstat when you run a single 16-process LU
job on the same 16 cores (0-15)?
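
As a rough cross-check alongside top and mpstat, the sketch below asks the
kernel which CPUs each process is allowed to run on and lists any CPU that
more than one process may use. It assumes Linux and Python 3.3 or later
(for os.sched_getaffinity); the helper names cpu_sets and shared_cpus are
made up for illustration and are not part of mvapich.

```python
import os
import sys


def cpu_sets(pids):
    """Map each PID to the set of CPUs the kernel allows it to run on (Linux only)."""
    return {pid: os.sched_getaffinity(pid) for pid in pids}


def shared_cpus(affinities):
    """Return the CPUs that appear in the allowed set of more than one process."""
    seen, shared = set(), set()
    for cpus in affinities.values():
        cpus = set(cpus)
        shared |= seen & cpus   # CPUs already claimed by an earlier process
        seen |= cpus
    return shared


if __name__ == "__main__":
    # Usage: pass the PIDs of the MPI ranks (e.g. as listed by top) as arguments.
    pids = [int(arg) for arg in sys.argv[1:]]
    affinities = cpu_sets(pids)
    for pid, cpus in sorted(affinities.items()):
        print(pid, sorted(cpus))
    print("CPUs allowed for more than one process:", sorted(shared_cpus(affinities)))
```

If the two jobs are really pinned to disjoint cores, the last line should
list no CPUs. With affinity disabled, every process is allowed on every
core, so all cores would be listed.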

You also indicated in your original e-mail that a single node has 32
cores. I am assuming that it has eight sockets of four cores each. Are
these Opterons or any other processor type?

DK

On Sat, 29 Aug 2009, Emir Imamagic wrote:

> Dhabaleswar Panda wrote:
> > Which interface of mvapich 1.1.0 are you using: Gen2 or Gen2-hybrid? If
> > you are using the `Gen2' interface, VIADEV_USE_AFFINITY=0 should disable
> > affinity. For Gen2-Hybrid, the variable is MV_USE_AFFINITY. Also, for the
> > Gen2 interface, there is a CPU mapping option, VIADEV_CPU_MAPPING, through
> > which you can run an MPI job on a specified set of cores. Can you try
> > this option to make sure that different MPI jobs are explicitly mapped
> > to different cores?
>
> I'm using Gen2, and I tried both
> - VIADEV_USE_AFFINITY=0 and
> - VIADEV_USE_AFFINITY=1 with VIADEV_CPU_MAPPING:
>   mpirun_rsh -ssh -np 8 -hostfile ./machines VIADEV_CPU_MAPPING=0,1,2,3,4,5,6,7 VIADEV_USE_AFFINITY=1 ./lu.C.8.mvapich
>   mpirun_rsh -ssh -np 8 -hostfile ./machines VIADEV_CPU_MAPPING=8,9,10,11,12,13,14,15 VIADEV_USE_AFFINITY=1 ./lu.C.8.mvapich
>
> The result was the same in both cases. Below is the output of top and
> mpstat when VIADEV_CPU_MAPPING was used.
>
> Cheers,
> emir
>
>    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 30434 eimamagi  25   0  164m 117m  11m R 50.2  0.2   0:56.78 lu.C.8.mvapich
> 30435 eimamagi  25   0  164m 111m 5044 R 50.2  0.2   0:56.79 lu.C.8.mvapich
> 30436 eimamagi  25   0  164m 111m 5156 R 50.2  0.2   0:56.75 lu.C.8.mvapich
> 30437 eimamagi  25   0  164m 109m 3200 R 50.2  0.2   0:56.80 lu.C.8.mvapich
> 30440 eimamagi  25   0  164m 110m 4360 R 50.2  0.2   0:56.77 lu.C.8.mvapich
> 30441 eimamagi  25   0  164m 109m 3168 R 50.2  0.2   0:56.78 lu.C.8.mvapich
> 30692 eimamagi  25   0  164m 109m 3132 R 50.2  0.2   0:29.20 lu.C.8.mvapich
> 30693 eimamagi  25   0  164m 111m 5080 R 50.2  0.2   0:29.39 lu.C.8.mvapich
> 30438 eimamagi  25   0  164m 109m 3172 R 49.8  0.2   0:56.74 lu.C.8.mvapich
> 30439 eimamagi  25   0  164m 111m 5068 R 49.8  0.2   0:56.54 lu.C.8.mvapich
> 30688 eimamagi  25   0  164m 117m  11m R 49.8  0.2   0:29.15 lu.C.8.mvapich
> 30689 eimamagi  25   0  164m 110m 4152 R 49.8  0.2   0:29.15 lu.C.8.mvapich
> 30690 eimamagi  25   0  164m 111m 5092 R 49.8  0.2   0:29.15 lu.C.8.mvapich
> 30691 eimamagi  25   0  164m 109m 3360 R 49.8  0.2   0:29.15 lu.C.8.mvapich
> 30694 eimamagi  25   0  164m 110m 4568 R 49.8  0.2   0:29.16 lu.C.8.mvapich
> 30695 eimamagi  25   0  164m 109m 3116 R 49.8  0.2   0:29.15 lu.C.8.mvapich
>
> And mpstat -P ALL:
> $ mpstat -P ALL
> Linux 2.6.18-128.1.16.el5      08/29/2009
>
> 05:17:48 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
> 05:17:48 PM  all   15.18    0.00    1.39    0.01    0.00    0.00    0.00   83.42    133.77
> 05:17:48 PM    0   33.55    0.01    5.40    0.04    0.00    0.02    0.00   60.98    133.77
> 05:17:48 PM    1   43.95    0.00    5.43    0.00    0.00    0.00    0.00   50.61      0.00
> 05:17:48 PM    2   50.77    0.00    5.39    0.00    0.00    0.00    0.00   43.83      0.00
> 05:17:48 PM    3   49.93    0.00    5.46    0.00    0.00    0.00    0.00   44.60      0.00
> 05:17:48 PM    4   38.46    0.00    5.28    0.00    0.00    0.00    0.00   56.26      0.00
> 05:17:48 PM    5   33.77    0.00    5.29    0.00    0.00    0.00    0.00   60.93      0.00
> 05:17:48 PM    6   45.39    0.00    5.25    0.00    0.00    0.00    0.00   49.36      0.00
> 05:17:48 PM    7   35.32    0.00    5.30    0.00    0.00    0.00    0.00   59.38      0.00
> 05:17:48 PM    8   16.53    0.00    0.05    0.02    0.00    0.00    0.00   83.40      0.00
> 05:17:48 PM    9    5.67    0.00    0.05    0.03    0.00    0.00    0.00   94.25      0.00
> 05:17:48 PM   10    0.81    0.00    0.06    0.02    0.00    0.00    0.00   99.11      0.00
> 05:17:48 PM   11    0.81    0.00    0.07    0.02    0.00    0.00    0.00   99.09      0.00
> 05:17:48 PM   12   32.88    0.00    0.13    0.00    0.00    0.00    0.00   66.99      0.00
> 05:17:48 PM   13    0.94    0.00    0.06    0.00    0.00    0.00    0.00   99.00      0.00
> 05:17:48 PM   14    5.91    0.00    0.05    0.00    0.00    0.00    0.00   94.04      0.00
> 05:17:48 PM   15    0.82    0.00    0.10    0.00    0.00    0.00    0.00   99.08      0.00
> 05:17:48 PM   16   26.38    0.00    0.13    0.00    0.00    0.00    0.00   73.49      0.00
> 05:17:48 PM   17    1.83    0.00    0.04    0.00    0.00    0.00    0.00   98.13      0.00
> 05:17:48 PM   18    1.98    0.00    0.03    0.00    0.00    0.00    0.00   97.99      0.00
> 05:17:48 PM   19    0.80    0.01    0.24    0.00    0.00    0.00    0.00   98.95      0.00
> 05:17:48 PM   20   15.50    0.00    0.06    0.00    0.00    0.00    0.00   84.44      0.00
> 05:17:48 PM   21    3.34    0.00    0.07    0.00    0.00    0.00    0.00   96.58      0.00
> 05:17:48 PM   22    2.97    0.00    0.03    0.00    0.00    0.00    0.00   97.00      0.00
> 05:17:48 PM   23    2.35    0.00    0.13    0.00    0.00    0.00    0.00   97.52      0.00
> 05:17:48 PM   24   10.61    0.00    0.06    0.00    0.00    0.00    0.00   89.33      0.00
> 05:17:48 PM   25    4.52    0.01    0.24    0.02    0.00    0.00    0.00   95.21      0.00
> 05:17:48 PM   26    2.44    0.00    0.02    0.00    0.00    0.00    0.00   97.54      0.00
> 05:17:48 PM   27    1.54    0.00    0.02    0.00    0.00    0.00    0.00   98.43      0.00
> 05:17:48 PM   28    8.58    0.00    0.05    0.01    0.00    0.00    0.00   91.37      0.00
> 05:17:48 PM   29    4.75    0.00    0.02    0.01    0.00    0.00    0.00   95.23      0.00
> 05:17:48 PM   30    0.80    0.00    0.01    0.01    0.00    0.00    0.00   99.18      0.00
> 05:17:48 PM   31    1.99    0.00    0.01    0.01    0.00    0.00    0.00   97.99      0.00