[mvapich-discuss] mvapich2-2.0.1 performs very poorly on old mthca0 nodes when involving multiple CPUs per node.

khaled hamidouche hamidouc at cse.ohio-state.edu
Tue Mar 31 17:08:09 EDT 2015


Hi Limin,

With 2.0.1, can you please try explicit binding of the processes using
MV2_CPU_MAPPING=0,1,2 ....

Please refer to this section
http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.1rc2-userguide.html#x1-17000011.13
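For example, with mpirun_rsh the variable can be passed on the command line.
The following is a minimal sketch only: the hostnames n0/n1 are taken from
your run below, and the core IDs 0 and 1 assume two local ranks per node
bound to the first two cores (colon-separated per-rank syntax from the user
guide section above), so please adjust them to your nodes' topology:

    # bind local rank 0 to core 0 and local rank 1 to core 1 on each node
    mpirun_rsh -np 4 n0 n0 n1 n1 MV2_CPU_MAPPING=0:1 ./osu_alltoall

    # optionally print the binding that was actually applied
    mpirun_rsh -np 4 n0 n0 n1 n1 MV2_CPU_MAPPING=0:1 MV2_SHOW_CPU_BINDING=1 ./osu_alltoall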


Please let us know if this helps.

Thanks

On Tue, Mar 31, 2015 at 4:05 PM, Limin Gu <lgu at penguincomputing.com> wrote:

> Hi All,
>
> I have encountered a problem with mvapich2-2.0.1.
>
> I use the OSU benchmarks for performance testing on nodes that have old
> Mellanox cards with mthca0. mvapich2-2.0.0 performs fine on those nodes,
> but mvapich2-2.0.1, compiled the same way, performs horribly: it takes
> 1000 times longer than mvapich2-2.0.0. The problem happens when I use
> more than one CPU per node.
>
> For example, "mpirun_rsh -np 2 n0 n1 ./osu_alltoall" runs OK for both
> mvapich2-2.0.0 and mvapich2-2.0.1,
> but "mpirun_rsh -np 4 n0 n0 n1 n1 ./osu_alltoall" runs OK for
> mvapich2-2.0.0 and horribly for mvapich2-2.0.1.
>
> I also tried nodes with a newer Mellanox card (mlx4_0); mvapich2-2.0.1
> performs OK there with multiple CPUs per node.
>
> Does anyone else have the same problem? Is this problem related to
> hardware?
>
> Thanks!
>
> Limin
>
>
>
>

