[mvapich-discuss] Problem with more MPI jobs on the same node

Dhabaleswar Panda panda at cse.ohio-state.edu
Sat Aug 29 10:31:31 EDT 2009


Which interface of mvapich 1.1.0 are you using - Gen2 or Gen2-Hybrid? If
you are using the `Gen2' interface, VIADEV_USE_AFFINITY=0 should disable
affinity. For Gen2-Hybrid, the corresponding variable is MV_USE_AFFINITY.
The Gen2 interface also provides a CPU mapping option, VIADEV_CPU_MAPPING,
through which you can run an MPI job on a specified set of cores. Can you
try this option to make sure that different MPI jobs are explicitly mapped
to different cores?
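
As a rough sketch of what that could look like (assuming the mpirun_rsh
launcher; the hostname, core numbers and benchmark binary name below are
only illustrative), each job would get its own non-overlapping core list:

    # job 1 on cores 0-7 of node n001
    mpirun_rsh -np 8 n001 n001 n001 n001 n001 n001 n001 n001 \
        VIADEV_CPU_MAPPING=0:1:2:3:4:5:6:7 ./lu.A.8

    # job 2 on cores 8-15 of the same node
    mpirun_rsh -np 8 n001 n001 n001 n001 n001 n001 n001 n001 \
        VIADEV_CPU_MAPPING=8:9:10:11:12:13:14:15 ./lu.A.8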

DK

On Sat, 29 Aug 2009, Emir Imamagic wrote:

> Hello,
>
> we have a problem running multiple MPI jobs on the same node. We are
> using mvapich 1.1.0 on CentOS 5.3, compiled with Intel 11.1. The nodes
> are 32-core Opterons.
>
> We used the NPB LU benchmark compiled for 8 processes. With each new job
> started on the node, the CPU usage of all processes decreases (as
> observed with top). It seems that MPI processes from different jobs are
> assigned to the same cores (as described in the thread below). The
> slowdown scales consistently with the number of jobs:
> 2 jobs - 50% CPU usage (2x application runtime)
> 3 jobs - 33% CPU usage (3x application runtime)
> 4 jobs - 25% CPU usage (4x application runtime)
>
> The problem is also described in this thread:
> http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/2009-April/002251.html
> However, the suggested solution does not solve the problem. We set
> VIADEV_USE_AFFINITY=0. We even changed the source code
> (mpid/ch_gen2/viaparam.h):
> #define _AFFINITY_ 0
> Nothing helped.
>
> Thanks in advance,
> emir
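A quick way to confirm whether the jobs really end up pinned to the same
cores is to inspect each MPI process's affinity mask (a generic Linux
check, not specific to MVAPICH; <pid> stands for an actual process ID):

    # show the CPU affinity list of a running MPI process
    taskset -pc <pid>

    # or read it directly from /proc
    grep Cpus_allowed_list /proc/<pid>/status

If all processes report the same core, the overlap described above is
confirmed; with VIADEV_CPU_MAPPING set per job, the lists should be
disjoint.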


