[mvapich-discuss] Fwd: Completion of non-blocking collective MPI_Iallgather

Akshay Venkatesh akshay at cse.ohio-state.edu
Wed Jan 28 15:22:41 EST 2015


Hi Pramod,

We investigated the issue you reported and were able to reproduce it on our
local machines. Unfortunately there isn't an immediate fix for this, owing
to an algorithmic characteristic of Iallgather. We are working on an
alternative that allows for better overlap, but for now one way to make at
least partial progress on the outstanding non-blocking collective is to
issue a few rounds of MPI_Test() rather than the single call per iteration
in your code. Hopefully this lets you benefit from some partial overlap.
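
As a rough illustration of that pattern, here is a minimal, self-contained
sketch (the compute_chunk() helper, the loop counts, and the buffer sizes
are illustrative assumptions on my side, not MVAPICH2-specific code):

    #include <mpi.h>
    #include <stdlib.h>

    /* Stand-in for a few seconds of application work (illustrative only;
       the original pseudo code just calls compute(...)). */
    static void compute_chunk(void)
    {
        volatile double x = 0.0;
        for (long i = 0; i < 200000000L; i++)
            x += 1e-9;
    }

    int main(int argc, char **argv)
    {
        int rank, size, flag = 0;
        MPI_Request req;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        int sendval = rank;
        int *recvbuf = malloc(size * sizeof(int));

        MPI_Iallgather(&sendval, 1, MPI_INT,
                       recvbuf, 1, MPI_INT, MPI_COMM_WORLD, &req);

        for (int i = 0; i < 4; i++) {
            compute_chunk();
            /* Several MPI_Test calls per compute step give the progress
               engine more chances to advance the outstanding Iallgather
               than a single test per iteration. */
            for (int t = 0; t < 8 && !flag; t++)
                MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
        }

        /* Guarantees completion even if MPI_Test never returned true. */
        MPI_Wait(&req, MPI_STATUS_IGNORE);

        free(recvbuf);
        MPI_Finalize();
        return 0;
    }

If the tests still return false, breaking the compute phase into smaller
chunks and testing between them tends to expose more opportunities for the
library to progress the collective.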

On Tue, Jan 27, 2015 at 9:45 AM, pramod kumbhar <pramod.s.kumbhar at gmail.com>
wrote:

> Could someone provide any suggestion on this?
>
> Thanks.
>
> ---------- Forwarded message ----------
> From: Kumbhar Pramod Shivaji <pramod.kumbhar at epfl.ch>
> Date: Mon, Jan 26, 2015 at 6:02 PM
> Subject: [mvapich-discuss] Completion of non-blocking collective
> MPI_Iallgather
> To: "mvapich-discuss at cse.ohio-state.edu" <
> mvapich-discuss at cse.ohio-state.edu>
>
>
>  Hello All,
>
>  I am looking into MPI non-blocking collectives (specifically
> MPI_Iallgather & MPI_Iallgatherv) and wondering about the internal
> completion / progress of these routines. To explain, I have the following
> pseudo example:
>
>  MPI_Iallgather(... &req);       // called by all ranks at the same time
>  ...
>  while (i < 4) {
>      compute(...);               // compute for a few seconds
>      MPI_Test(&req, &flag, ...); // Can I expect this flag to become true,
>                                  // i.e. the operation completes here at
>                                  // some point?
>      i++;
>  }
>  ...
>  MPI_Wait(&req, ...);            // I see that MPI_Iallgather only finishes
>                                  // when everyone calls MPI_Wait here - why?
>
>
>  I was running this test example on a single node with 8 MPI ranks. I see
> that MPI_Test always returns false. With a compute() function that takes
> about two seconds, I would expect the MPI_Iallgather to finish early on,
> but instead it only completes at MPI_Wait(). See the attached sample trace
> of the program.
>
>  Could anyone point out the possible cause of this behaviour? It would be
> a great help.
>
>
>  Regards,
> Pramod
>
>  MVAPICH 2 version: 2.0
> Configure command:  /configure --with-slurm
>  --with-slurm-include=slurm/default/include
> --with-slurm-lib=slurm/default/lib  --with-pm=none --with-pmi=slurm
> --enable-threads=multiple --enable-shared --enable-sharedlibs=gcc
> --enable-fc --enable-cxx --with-mpe --enable-rdma-cm --enable-fast
> --enable-smpcoll --with-hwloc --enable-xrc --with-device=ch3:mrail
> --with-rdma=gen2 --enable-cuda --with-cuda=/usr/local/cuda-5.0
> --with-cuda-include=/usr/local/cuda-5.0/include
> --with-cuda-lib=/usr/local/cuda-5.0/lib64 --enable-g=dbg --enable-debuginfo
> CC=gcc CXX=g++ FC=gfortran F77=gfortran


-- 
- Akshay
[Attachment: mpi_iallgather_trace.png (image/png, 79473 bytes)
<http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20150128/5d8d14d1/attachment-0001.png>]

