[mvapich-discuss] Performance issue

Devendar Bureddy bureddy at cse.ohio-state.edu
Mon Jun 6 13:14:43 EDT 2011


Hi Masahiro,

Can you please try your experiment with MV2_ENABLE_AFFINITY=1 (the default
setting)? Please let us know the result.
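For example, with mpirun_rsh you should be able to pass it on the command
line (or simply drop the "export MV2_ENABLE_AFFINITY=0" line from your
environment, since 1 is the default):
---
mpirun_rsh -np 2 -hostfile hosts MV2_ENABLE_AFFINITY=1 hoge
---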

Thanks
Devendar

On Sun, Jun 5, 2011 at 10:48 AM, Dhabaleswar Panda <panda at cse.ohio-state.edu> wrote:

> Thanks for your reply. We will take a look at it and get back to you.
>
> Thanks,
>
> DK
>
> On Sun, 5 Jun 2011, Masahiro Nakao wrote:
>
> > Dear Professor Dhabaleswar Panda,
> >
> > Thank you for your answer.
> >
> > 2011/6/5 Dhabaleswar Panda <panda at cse.ohio-state.edu>:
> > > Thanks for your note. Could you tell us whether you are using a
> > > single-rail or multi-rail environment?
> >
> > I used a single-rail environment.
> >
> > The environment variables are as below.
> > --
> > - export MV2_NUM_HCAS=1
> > - export MV2_USE_SHARED_MEM=1
> > - export MV2_ENABLE_AFFINITY=0
> > - export MV2_NUM_PORTS=1
> > --
> > The compile option is only "-O3".
> > ---
> > mpicc -O3 hoge.c -o hoge
> > mpirun_rsh -np 2 -hostfile hosts hoge
> > ---
> >
> > Actually, I had tried a 4-rail environment before.
> > For that run I changed the environment as follows:
> > MV2_NUM_HCAS=1 -> MV2_NUM_HCAS=4
> > But the results were almost the same.
> > ---
> > 64 Byte:4.0 MByte/s
> > 128 Byte:7.2 MByte/s
> > 256 Byte:12.8 MByte/s
> > 512 Byte:28.3 MByte/s
> > 1024 Byte:57.3 MByte/s
> > 2048 Byte:93.4 MByte/s
> > 4096 Byte:136.3 MByte/s
> > 8192 Byte:216.1 MByte/s
> > 16384 Byte:40.0 MByte/s
> > 32768 Byte:72.0 MByte/s
> > ---
> > Here is another question:
> > Why is the 4-rail performance worse than the 1-rail performance?
> >
> > > Is this being run on the Tsukuba cluster?
> >
> > Yes, it is :).
> >
> > Regards,
> >
> > > DK
> > >
> > > On Sun, 5 Jun 2011, Masahiro Nakao wrote:
> > >
> > >> Dear all,
> > >>
> > >> I use mvapich2-1.7a.
> > >> I tried to measure throughput performance,
> > >> but I don't understand the values I got.
> > >>
> > >> The source code is as below.
> > >> ---
> > >> double tmp[SIZE];
> > >>   :
> > >> MPI_Barrier(MPI_COMM_WORLD);
> > >> t1 = gettimeofday_sec();
> > >> if(rank==0)   MPI_Send(tmp, SIZE, MPI_DOUBLE, 1, 999, MPI_COMM_WORLD);
> > >> else  MPI_Recv(tmp, SIZE, MPI_DOUBLE, 0, 999, MPI_COMM_WORLD, &status);
> > >> MPI_Barrier(MPI_COMM_WORLD);
> > >> t2 = gettimeofday_sec();
> > >>
> > >> printf("%zu Byte:%.1f MByte/s\n", sizeof(tmp), sizeof(tmp)/(t2-t1)/1000000);
> > >> ---
> > >> This program was run on 2 nodes.
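> > >> For reference, a complete, self-contained version of this measurement
> > >> might look like the sketch below (assuming gettimeofday_sec() is just a
> > >> gettimeofday() wrapper, SIZE is a compile-time constant, and exactly
> > >> 2 ranks are started):
> > >> ---
> > >> #include <mpi.h>
> > >> #include <stdio.h>
> > >> #include <sys/time.h>
> > >>
> > >> #define SIZE 2048            /* 2048 doubles = 16384 bytes */
> > >>
> > >> /* wall-clock time in seconds */
> > >> static double gettimeofday_sec(void)
> > >> {
> > >>   struct timeval tv;
> > >>   gettimeofday(&tv, NULL);
> > >>   return tv.tv_sec + tv.tv_usec * 1e-6;
> > >> }
> > >>
> > >> int main(int argc, char **argv)
> > >> {
> > >>   double tmp[SIZE];          /* contents don't matter for a bandwidth test */
> > >>   int rank;
> > >>   MPI_Status status;
> > >>
> > >>   MPI_Init(&argc, &argv);
> > >>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> > >>
> > >>   /* time one rank-0 -> rank-1 message, bracketed by barriers */
> > >>   MPI_Barrier(MPI_COMM_WORLD);
> > >>   double t1 = gettimeofday_sec();
> > >>   if(rank==0) MPI_Send(tmp, SIZE, MPI_DOUBLE, 1, 999, MPI_COMM_WORLD);
> > >>   else        MPI_Recv(tmp, SIZE, MPI_DOUBLE, 0, 999, MPI_COMM_WORLD, &status);
> > >>   MPI_Barrier(MPI_COMM_WORLD);
> > >>   double t2 = gettimeofday_sec();
> > >>
> > >>   if(rank==0)
> > >>     printf("%zu Byte:%.1f MByte/s\n", sizeof(tmp), sizeof(tmp)/(t2-t1)/1000000);
> > >>
> > >>   MPI_Finalize();
> > >>   return 0;
> > >> }
> > >> ---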
> > >>
> > >> Results are as below. SIZE = 1, 2, 4, ... , 4096
> > >> ---
> > >>     8 Byte:  0.6 MByte/s
> > >>    16 Byte:  1.1 MByte/s
> > >>    32 Byte:  2.1 MByte/s
> > >>    64 Byte:  4.5 MByte/s
> > >>   128 Byte:  8.0 MByte/s
> > >>   256 Byte: 15.1 MByte/s
> > >>   512 Byte: 30.2 MByte/s
> > >>  1024 Byte: 68.2 MByte/s
> > >>  2048 Byte:102.3 MByte/s
> > >>  4096 Byte:157.6 MByte/s
> > >>  8192 Byte:195.2 MByte/s
> > >> 16384 Byte: 80.8 MByte/s
> > >> 32768 Byte:142.4 MByte/s
> > >> ---
> > >>
> > >> Why does the throughput drop when the transfer size is 16384 bytes?
> > >>
> > >> My environment is as below.
> > >> ---
> > >> - Quad-Core AMD Opteron(tm) Processor 8356
> > >> - DDR Infiniband (Mellanox ConnectX)
> > >> ---
> > >>
> > >> Regards,
> > >> --
> > >> Masahiro NAKAO
> > >> Email : mnakao at ccs.tsukuba.ac.jp
> > >> Researcher
> > >> Center for Computational Sciences
> > >> UNIVERSITY OF TSUKUBA
> > >
> > >
> >
> >
> >
> > --
> > Masahiro NAKAO
> > Email : mnakao at ccs.tsukuba.ac.jp
> > Researcher
> > Center for Computational Sciences
> > UNIVERSITY OF TSUKUBA
> >
>
>