[mvapich-discuss] Fwd: Completion of non-blocking collective MPI_Iallgather

pramod kumbhar pramod.s.kumbhar at gmail.com
Thu Jan 29 18:27:04 EST 2015


Hi Akshay,

Thanks for the update. I tried this and it seems better now with the extra
MPI_Test calls.

-Pramod

On Wed, Jan 28, 2015 at 9:22 PM, Akshay Venkatesh <akshay at cse.ohio-state.edu> wrote:

> Hi Pramod,
>
> We investigated the issue you reported and were able to reproduce it on
> our local machines. Unfortunately there isn't an immediate fix for this,
> owing to an algorithmic characteristic of Iallgather. We are working on an
> alternative that allows for better overlap, but for the present, one way to
> make at least partial progress on the outstanding non-blocking collective
> is to issue a few rounds of MPI_Test() rather than the single call you have
> in your code. Hopefully this lets you benefit from some partial overlap.
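>
> A minimal sketch of that workaround, assuming a hypothetical
> compute_slice() helper standing in for a fraction of your computation
> (the loop bounds are illustrative choices, not MVAPICH specifics):
>
> #include <mpi.h>
>
> extern void compute_slice(void);  /* hypothetical: one slice of the work */
>
> /* Several MPI_Test rounds per compute phase give the MPI library
>  * repeated opportunities to progress the outstanding Iallgather. */
> void overlap_loop(MPI_Request *req)
> {
>     int flag = 0;
>     for (int i = 0; i < 4 && !flag; i++) {
>         compute_slice();
>         for (int t = 0; t < 10 && !flag; t++)   /* a few test rounds */
>             MPI_Test(req, &flag, MPI_STATUS_IGNORE);
>     }
>     if (!flag)
>         MPI_Wait(req, MPI_STATUS_IGNORE);       /* finish whatever remains */
> }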
>
> On Tue, Jan 27, 2015 at 9:45 AM, pramod kumbhar <pramod.s.kumbhar at gmail.com> wrote:
>
>> Could someone provide a suggestion on this?
>>
>> Thanks.
>>
>> ---------- Forwarded message ----------
>> From: Kumbhar Pramod Shivaji <pramod.kumbhar at epfl.ch>
>> Date: Mon, Jan 26, 2015 at 6:02 PM
>> Subject: [mvapich-discuss] Completion of non-blocking collective
>> MPI_Iallgather
>> To: "mvapich-discuss at cse.ohio-state.edu" <
>> mvapich-discuss at cse.ohio-state.edu>
>>
>>
>>  Hello All,
>>
>>  I am looking into MPI non-blocking collectives (specifically
>> MPI_Iallgather and MPI_Iallgatherv) and wondering about the internal
>> completion / progress of these routines. To explain, I have the
>> following pseudo example:
>>
>>  MPI_Iallgather(…&req);       // this is called by all ranks at the same time
>> ...
>> while (i < 4) {
>>     compute(…);               // compute for a few seconds
>>     MPI_Test(&req, &flag, &status);  // Can I expect this flag to become
>>                                      // true, i.e. that the operation
>>                                      // completes here at some point?
>>     i++;
>> }
>> …
>> MPI_Wait(&req, &status);   // I see that MPI_Iallgather only finishes when
>>                            // everyone calls MPI_Wait at this point. Why?
>>
>>
>>  I ran this test on a single node with 8 MPI ranks, and MPI_Test always
>> returns false. With a compute() phase of about two seconds per iteration,
>> I would expect MPI_Iallgather to finish early in the loop, but instead it
>> only completes at MPI_Wait(). See the attached sample trace of the program.
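>>
>>  For reference, a self-contained version of the pseudo example might look
>> like the sketch below; the buffer sizes, the sleep(2) standing in for
>> compute(), and the iteration count are illustrative choices, not the
>> exact values from my program:
>>
>> #include <mpi.h>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <unistd.h>
>>
>> int main(int argc, char **argv)
>> {
>>     MPI_Init(&argc, &argv);
>>     int rank, size;
>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>>
>>     int sendval = rank;
>>     int *recvbuf = malloc(size * sizeof(int));
>>     MPI_Request req;
>>     int flag = 0;
>>
>>     /* Started by all ranks at (roughly) the same time. */
>>     MPI_Iallgather(&sendval, 1, MPI_INT, recvbuf, 1, MPI_INT,
>>                    MPI_COMM_WORLD, &req);
>>
>>     for (int i = 0; i < 4; i++) {
>>         sleep(2);                              /* stand-in for compute() */
>>         MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
>>         printf("rank %d iter %d flag %d\n", rank, i, flag);
>>     }
>>
>>     /* In the behaviour described above, flag stays 0 throughout the loop
>>      * and the collective only completes inside this MPI_Wait. */
>>     MPI_Wait(&req, MPI_STATUS_IGNORE);
>>
>>     free(recvbuf);
>>     MPI_Finalize();
>>     return 0;
>> }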
>>
>>  Could anyone point out a possible cause of this behaviour? It would be
>> a great help.
>>
>>
>>  Regards,
>> Pramod
>>
>>  MVAPICH2 version: 2.0
>> Configure command: ./configure --with-slurm
>>  --with-slurm-include=slurm/default/include
>> --with-slurm-lib=slurm/default/lib  --with-pm=none --with-pmi=slurm
>> --enable-threads=multiple --enable-shared --enable-sharedlibs=gcc
>> --enable-fc --enable-cxx --with-mpe --enable-rdma-cm --enable-fast
>> --enable-smpcoll --with-hwloc --enable-xrc --with-device=ch3:mrail
>> --with-rdma=gen2 --enable-cuda --with-cuda=/usr/local/cuda-5.0
>> --with-cuda-include=/usr/local/cuda-5.0/include
>> --with-cuda-lib=/usr/local/cuda-5.0/lib64 --enable-g=dbg --enable-debuginfo
>> CC=gcc CXX=g++ FC=gfortran F77=gfortran
>>
>>
>>
>>
>> _______________________________________________
>> mvapich-discuss mailing list
>> mvapich-discuss at cse.ohio-state.edu
>> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>
>
>
> --
> - Akshay
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mpi_iallgather_trace.png
Type: image/png
Size: 79473 bytes
Desc: not available
URL: <http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20150130/1b1a7639/attachment-0001.png>

