[mvapich-discuss] Problem with more MPI jobs on the same node
Dhabaleswar Panda
panda at cse.ohio-state.edu
Sat Aug 29 11:39:22 EDT 2009
What is the output of top and mpstat when you run a 16-process LU job on
the same 16 cores (0-15)?
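For the 16-core run, the mapping string has to list cores 0 through 15. A small helper like the one below could generate it in the same comma-separated form used in the commands quoted further down. This is only a sketch: cpu_mapping is a hypothetical helper, not part of MVAPICH, and it assumes that a plain comma-separated list of core IDs is what VIADEV_CPU_MAPPING expects.

```python
def cpu_mapping(first_core, count):
    """Build a comma-separated list of `count` consecutive core IDs.

    Hypothetical helper for illustration only; not part of MVAPICH.
    """
    if first_core < 0 or count <= 0:
        raise ValueError("first_core must be >= 0 and count must be > 0")
    return ",".join(str(core) for core in range(first_core, first_core + count))


if __name__ == "__main__":
    # One 16-process job on cores 0-15.
    print("VIADEV_CPU_MAPPING=" + cpu_mapping(0, 16))
    # Two 8-process jobs on disjoint halves of the same 16 cores.
    print("VIADEV_CPU_MAPPING=" + cpu_mapping(0, 8))
    print("VIADEV_CPU_MAPPING=" + cpu_mapping(8, 8))
```

Generating the lists this way mainly guards against overlapping or mistyped core IDs when several jobs share a node.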
You also indicated in your original e-mail that a single node has 32
cores. I am assuming that it has eight sockets of four cores each. Are
these Opterons or some other processor type?
DK
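It may also help to read back the affinity mask that the kernel actually applied to each rank, since top and mpstat show load rather than placement. The sketch below assumes a Linux node with /proc mounted and Python 3.3 or later (for os.sched_getaffinity); allowed_cpus and affinity_by_name are hypothetical helper names, not part of any MVAPICH tooling.

```python
import os


def allowed_cpus(pid=0):
    """Return the sorted list of CPUs that `pid` may run on (0 = this process)."""
    return sorted(os.sched_getaffinity(pid))


def affinity_by_name(name):
    """Map PID -> allowed CPUs for every process whose executable is `name`."""
    result = {}
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(f"/proc/{entry}/cmdline", "rb") as f:
                argv0 = f.read().split(b"\0")[0].decode(errors="replace")
            if os.path.basename(argv0) == name:
                result[int(entry)] = allowed_cpus(int(entry))
        except OSError:
            continue  # process exited or is not readable; ignore it
    return result


if __name__ == "__main__":
    # Executable name taken from the commands quoted in this thread.
    for pid, cpus in sorted(affinity_by_name("lu.C.8.mvapich").items()):
        print(pid, cpus)
```

If the ranks of both jobs report the same CPU list, that would suggest the second mapping was not applied as intended.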
On Sat, 29 Aug 2009, Emir Imamagic wrote:
> Dhabaleswar Panda wrote:
> > Which interface of mvapich 1.1.0 are you using: Gen2 or Gen2-hybrid? If
> > you are using the 'Gen2' interface, VIADEV_USE_AFFINITY=0 should disable
> > affinity. For Gen2-hybrid, the variable is MV_USE_AFFINITY. Also, for the
> > Gen2 interface, there is a CPU mapping option, VIADEV_CPU_MAPPING, through
> > which you can run an MPI job on a specified set of cores. Can you try
> > this option to make sure that different MPI jobs are explicitly mapped
> > to different cores?
>
> I'm using Gen2, and I tried both
> - VIADEV_USE_AFFINITY=0 and
> - VIADEV_USE_AFFINITY=1 with VIADEV_CPU_MAPPING:
> mpirun_rsh -ssh -np 8 -hostfile ./machines \
>     VIADEV_CPU_MAPPING=0,1,2,3,4,5,6,7 VIADEV_USE_AFFINITY=1 ./lu.C.8.mvapich
> mpirun_rsh -ssh -np 8 -hostfile ./machines \
>     VIADEV_CPU_MAPPING=8,9,10,11,12,13,14,15 VIADEV_USE_AFFINITY=1 ./lu.C.8.mvapich
>
> The result was the same. Below is the output of top and mpstat when
> VIADEV_CPU_MAPPING was used.
>
> Cheers,
> emir
>
>   PID USER      PR NI  VIRT  RES  SHR S %CPU %MEM   TIME+ COMMAND
> 30434 eimamagi  25  0  164m 117m  11m R 50.2  0.2 0:56.78 lu.C.8.mvapich
> 30435 eimamagi  25  0  164m 111m 5044 R 50.2  0.2 0:56.79 lu.C.8.mvapich
> 30436 eimamagi  25  0  164m 111m 5156 R 50.2  0.2 0:56.75 lu.C.8.mvapich
> 30437 eimamagi  25  0  164m 109m 3200 R 50.2  0.2 0:56.80 lu.C.8.mvapich
> 30440 eimamagi  25  0  164m 110m 4360 R 50.2  0.2 0:56.77 lu.C.8.mvapich
> 30441 eimamagi  25  0  164m 109m 3168 R 50.2  0.2 0:56.78 lu.C.8.mvapich
> 30692 eimamagi  25  0  164m 109m 3132 R 50.2  0.2 0:29.20 lu.C.8.mvapich
> 30693 eimamagi  25  0  164m 111m 5080 R 50.2  0.2 0:29.39 lu.C.8.mvapich
> 30438 eimamagi  25  0  164m 109m 3172 R 49.8  0.2 0:56.74 lu.C.8.mvapich
> 30439 eimamagi  25  0  164m 111m 5068 R 49.8  0.2 0:56.54 lu.C.8.mvapich
> 30688 eimamagi  25  0  164m 117m  11m R 49.8  0.2 0:29.15 lu.C.8.mvapich
> 30689 eimamagi  25  0  164m 110m 4152 R 49.8  0.2 0:29.15 lu.C.8.mvapich
> 30690 eimamagi  25  0  164m 111m 5092 R 49.8  0.2 0:29.15 lu.C.8.mvapich
> 30691 eimamagi  25  0  164m 109m 3360 R 49.8  0.2 0:29.15 lu.C.8.mvapich
> 30694 eimamagi  25  0  164m 110m 4568 R 49.8  0.2 0:29.16 lu.C.8.mvapich
> 30695 eimamagi  25  0  164m 109m 3116 R 49.8  0.2 0:29.15 lu.C.8.mvapich
>
> And mpstat -P ALL:
> $ mpstat -P ALL
> Linux 2.6.18-128.1.16.el5 08/29/2009
>
> 05:17:48 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle  intr/s
> 05:17:48 PM  all   15.18    0.00    1.39    0.01    0.00    0.00    0.00   83.42  133.77
> 05:17:48 PM    0   33.55    0.01    5.40    0.04    0.00    0.02    0.00   60.98  133.77
> 05:17:48 PM    1   43.95    0.00    5.43    0.00    0.00    0.00    0.00   50.61    0.00
> 05:17:48 PM    2   50.77    0.00    5.39    0.00    0.00    0.00    0.00   43.83    0.00
> 05:17:48 PM    3   49.93    0.00    5.46    0.00    0.00    0.00    0.00   44.60    0.00
> 05:17:48 PM    4   38.46    0.00    5.28    0.00    0.00    0.00    0.00   56.26    0.00
> 05:17:48 PM    5   33.77    0.00    5.29    0.00    0.00    0.00    0.00   60.93    0.00
> 05:17:48 PM    6   45.39    0.00    5.25    0.00    0.00    0.00    0.00   49.36    0.00
> 05:17:48 PM    7   35.32    0.00    5.30    0.00    0.00    0.00    0.00   59.38    0.00
> 05:17:48 PM    8   16.53    0.00    0.05    0.02    0.00    0.00    0.00   83.40    0.00
> 05:17:48 PM    9    5.67    0.00    0.05    0.03    0.00    0.00    0.00   94.25    0.00
> 05:17:48 PM   10    0.81    0.00    0.06    0.02    0.00    0.00    0.00   99.11    0.00
> 05:17:48 PM   11    0.81    0.00    0.07    0.02    0.00    0.00    0.00   99.09    0.00
> 05:17:48 PM   12   32.88    0.00    0.13    0.00    0.00    0.00    0.00   66.99    0.00
> 05:17:48 PM   13    0.94    0.00    0.06    0.00    0.00    0.00    0.00   99.00    0.00
> 05:17:48 PM   14    5.91    0.00    0.05    0.00    0.00    0.00    0.00   94.04    0.00
> 05:17:48 PM   15    0.82    0.00    0.10    0.00    0.00    0.00    0.00   99.08    0.00
> 05:17:48 PM   16   26.38    0.00    0.13    0.00    0.00    0.00    0.00   73.49    0.00
> 05:17:48 PM   17    1.83    0.00    0.04    0.00    0.00    0.00    0.00   98.13    0.00
> 05:17:48 PM   18    1.98    0.00    0.03    0.00    0.00    0.00    0.00   97.99    0.00
> 05:17:48 PM   19    0.80    0.01    0.24    0.00    0.00    0.00    0.00   98.95    0.00
> 05:17:48 PM   20   15.50    0.00    0.06    0.00    0.00    0.00    0.00   84.44    0.00
> 05:17:48 PM   21    3.34    0.00    0.07    0.00    0.00    0.00    0.00   96.58    0.00
> 05:17:48 PM   22    2.97    0.00    0.03    0.00    0.00    0.00    0.00   97.00    0.00
> 05:17:48 PM   23    2.35    0.00    0.13    0.00    0.00    0.00    0.00   97.52    0.00
> 05:17:48 PM   24   10.61    0.00    0.06    0.00    0.00    0.00    0.00   89.33    0.00
> 05:17:48 PM   25    4.52    0.01    0.24    0.02    0.00    0.00    0.00   95.21    0.00
> 05:17:48 PM   26    2.44    0.00    0.02    0.00    0.00    0.00    0.00   97.54    0.00
> 05:17:48 PM   27    1.54    0.00    0.02    0.00    0.00    0.00    0.00   98.43    0.00
> 05:17:48 PM   28    8.58    0.00    0.05    0.01    0.00    0.00    0.00   91.37    0.00
> 05:17:48 PM   29    4.75    0.00    0.02    0.01    0.00    0.00    0.00   95.23    0.00
> 05:17:48 PM   30    0.80    0.00    0.01    0.01    0.00    0.00    0.00   99.18    0.00
> 05:17:48 PM   31    1.99    0.00    0.01    0.01    0.00    0.00    0.00   97.99    0.00
>
>
>
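As a quick sanity check on the top listing quoted above, the %CPU column can be added up. The sketch below is illustrative only: it hard-codes the 16 rows from that listing and assumes %CPU is the ninth whitespace-separated field, as in the header shown there. The values sum to 800%, roughly 8 cores' worth of CPU time spread over 16 processes. That would be consistent with both jobs running on the same 8 cores, although top alone does not show which cores those are.

```python
# The 16 process rows copied from the quoted top output.
TOP_ROWS = """\
30434 eimamagi 25 0 164m 117m 11m R 50.2 0.2 0:56.78 lu.C.8.mvapich
30435 eimamagi 25 0 164m 111m 5044 R 50.2 0.2 0:56.79 lu.C.8.mvapich
30436 eimamagi 25 0 164m 111m 5156 R 50.2 0.2 0:56.75 lu.C.8.mvapich
30437 eimamagi 25 0 164m 109m 3200 R 50.2 0.2 0:56.80 lu.C.8.mvapich
30440 eimamagi 25 0 164m 110m 4360 R 50.2 0.2 0:56.77 lu.C.8.mvapich
30441 eimamagi 25 0 164m 109m 3168 R 50.2 0.2 0:56.78 lu.C.8.mvapich
30692 eimamagi 25 0 164m 109m 3132 R 50.2 0.2 0:29.20 lu.C.8.mvapich
30693 eimamagi 25 0 164m 111m 5080 R 50.2 0.2 0:29.39 lu.C.8.mvapich
30438 eimamagi 25 0 164m 109m 3172 R 49.8 0.2 0:56.74 lu.C.8.mvapich
30439 eimamagi 25 0 164m 111m 5068 R 49.8 0.2 0:56.54 lu.C.8.mvapich
30688 eimamagi 25 0 164m 117m 11m R 49.8 0.2 0:29.15 lu.C.8.mvapich
30689 eimamagi 25 0 164m 110m 4152 R 49.8 0.2 0:29.15 lu.C.8.mvapich
30690 eimamagi 25 0 164m 111m 5092 R 49.8 0.2 0:29.15 lu.C.8.mvapich
30691 eimamagi 25 0 164m 109m 3360 R 49.8 0.2 0:29.15 lu.C.8.mvapich
30694 eimamagi 25 0 164m 110m 4568 R 49.8 0.2 0:29.16 lu.C.8.mvapich
30695 eimamagi 25 0 164m 109m 3116 R 49.8 0.2 0:29.15 lu.C.8.mvapich
"""

CPU_FIELD = 8  # zero-based index of %CPU in the header shown in the listing


def cpu_percentages(text):
    """Extract the %CPU value from each non-empty top row."""
    return [float(line.split()[CPU_FIELD]) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    values = cpu_percentages(TOP_ROWS)
    total = sum(values)
    print(f"{len(values)} processes, {total:.1f}% CPU in total, "
          f"about {total / 100:.1f} cores' worth")
```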