[mvapich-discuss] Fwd: Completion of non-blocking collective MPI_Iallgather

Akshay Venkatesh akshay at cse.ohio-state.edu
Sun Apr 5 02:14:26 EDT 2015


Hi Pramod,

The recent release hasn't addressed this particular issue yet; the fix is
targeted for the 2.2 version instead.

Thanks
On Apr 4, 2015 5:00 AM, "pramod kumbhar" <pramod.s.kumbhar at gmail.com> wrote:

> Hi Akshay,
>
> Is this fixed/improved in the latest release?
>
> Regards,
> Pramod
>
> On Fri, Jan 30, 2015 at 12:27 AM, pramod kumbhar <
> pramod.s.kumbhar at gmail.com> wrote:
>
>> Hi Akshay,
>>
>> Thanks for the update. I tried this and it seems better now with the extra
>> MPI_Test calls.
>>
>> -Pramod
>>
>>
>> On Wed, Jan 28, 2015 at 9:22 PM, Akshay Venkatesh <
>> akshay at cse.ohio-state.edu> wrote:
>>
>>> Hi Pramod,
>>>
>>> We investigated the issue you reported and were able to reproduce it on
>>> our local machines. Unfortunately there isn't an immediate fix, owing to
>>> an algorithmic characteristic of Iallgather. We are working on an
>>> alternative that allows better overlap, but for the present, one way to
>>> make at least partial progress on the outstanding non-blocking collective
>>> is to issue a few rounds of MPI_Test() rather than the single call you
>>> have in your code. Hopefully this lets you benefit from some partial overlap.
>>>
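>>> For example, a minimal sketch of that suggestion, reusing the shape of your
>>> pseudo example (sendbuf, recvbuf, count, compute() and the 16 test rounds
>>> per iteration are placeholders, not specific recommended values):
>>>
>>>     MPI_Request req;
>>>     int flag = 0;
>>>     MPI_Iallgather(sendbuf, count, MPI_INT, recvbuf, count, MPI_INT,
>>>                    MPI_COMM_WORLD, &req);
>>>     for (int i = 0; i < 4; i++) {
>>>         compute();                          /* a few seconds of work      */
>>>         /* several test rounds instead of a single MPI_Test per iteration */
>>>         /* give the library more chances to progress the collective       */
>>>         for (int t = 0; t < 16 && !flag; t++)
>>>             MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
>>>     }
>>>     MPI_Wait(&req, MPI_STATUS_IGNORE);
>>>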
>>> On Tue, Jan 27, 2015 at 9:45 AM, pramod kumbhar <
>>> pramod.s.kumbhar at gmail.com> wrote:
>>>
>>>> Could someone provide any suggestion on this?
>>>>
>>>> Thanks.
>>>>
>>>> ---------- Forwarded message ----------
>>>> From: Kumbhar Pramod Shivaji <pramod.kumbhar at epfl.ch>
>>>> Date: Mon, Jan 26, 2015 at 6:02 PM
>>>> Subject: [mvapich-discuss] Completion of non-blocking collective
>>>> MPI_Iallgather
>>>> To: "mvapich-discuss at cse.ohio-state.edu" <
>>>> mvapich-discuss at cse.ohio-state.edu>
>>>>
>>>>
>>>>  Hello All,
>>>>
>>>>  I am looking into MPI non-blocking collectives (specifically
>>>> MPI_Iallgather & MPI_Iallgatherv) and am wondering about the internal
>>>> completion / progress of these routines. To explain, I have the following
>>>> pseudo example:
>>>>
>>>>  MPI_Iallgather(… &req);       // this is called by all ranks at the same time
>>>>  ...
>>>>  while (i < 4) {
>>>>      compute(…);               // compute for a few seconds
>>>>      MPI_Test(&req, &flag, …); // Can I expect this flag to become true,
>>>>                                // i.e. the operation to complete at some point here?
>>>>      i++;
>>>>  }
>>>>  ….
>>>>  MPI_Wait(&req, …);            // I see that MPI_Iallgather only finishes
>>>>                                // when everyone calls MPI_Wait at this point, why?
>>>>
>>>>
>>>>  I was running this test example on a single node with 8 MPI ranks. I
>>>> see that MPI_Test always returns false. With a compute() function that
>>>> takes two seconds, I expect MPI_Iallgather to finish quickly, but instead
>>>> it only completes at MPI_Wait(). See the attached sample trace of the
>>>> program.
>>>>
>>>>  Could anyone point out the possible cause of this behaviour? It would
>>>> be a great help.
>>>>
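>>>>  For completeness, a self-contained version of the pattern above (the
>>>> buffer contents and the 2-second sleep stand in for my real code):
>>>>
>>>>     #include <mpi.h>
>>>>     #include <stdio.h>
>>>>     #include <stdlib.h>
>>>>     #include <unistd.h>
>>>>
>>>>     int main(int argc, char **argv) {
>>>>         MPI_Init(&argc, &argv);
>>>>         int rank, size;
>>>>         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>>         MPI_Comm_size(MPI_COMM_WORLD, &size);
>>>>
>>>>         int sendval = rank;
>>>>         int *recvbuf = malloc(size * sizeof(int));
>>>>
>>>>         MPI_Request req;
>>>>         int flag = 0;
>>>>         MPI_Iallgather(&sendval, 1, MPI_INT, recvbuf, 1, MPI_INT,
>>>>                        MPI_COMM_WORLD, &req);
>>>>
>>>>         for (int i = 0; i < 4; i++) {
>>>>             sleep(2);                      /* stand-in for compute()     */
>>>>             MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
>>>>             if (rank == 0)
>>>>                 printf("iteration %d: flag = %d\n", i, flag);
>>>>         }
>>>>
>>>>         MPI_Wait(&req, MPI_STATUS_IGNORE); /* completes only here in my runs */
>>>>         free(recvbuf);
>>>>         MPI_Finalize();
>>>>         return 0;
>>>>     }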
>>>>
>>>>  Regards,
>>>> Pramod
>>>>
>>>>  MVAPICH 2 version: 2.0
>>>> Configure command:  /configure --with-slurm
>>>>  --with-slurm-include=slurm/default/include
>>>> --with-slurm-lib=slurm/default/lib  --with-pm=none --with-pmi=slurm
>>>> --enable-threads=multiple --enable-shared --enable-sharedlibs=gcc
>>>> --enable-fc --enable-cxx --with-mpe --enable-rdma-cm --enable-fast
>>>> --enable-smpcoll --with-hwloc --enable-xrc --with-device=ch3:mrail
>>>> --with-rdma=gen2 --enable-cuda --with-cuda=/usr/local/cuda-5.0
>>>> --with-cuda-include=/usr/local/cuda-5.0/include
>>>> --with-cuda-lib=/usr/local/cuda-5.0/lib64 --enable-g=dbg --enable-debuginfo
>>>> CC=gcc CXX=g++ FC=gfortran F77=gfortran
>>>>
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> mvapich-discuss mailing list
>>>> mvapich-discuss at cse.ohio-state.edu
>>>> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>>>
>>>
>>>
>>> --
>>> - Akshay
>>>
>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mpi_iallgather_trace.png
Type: image/png
Size: 79473 bytes
Desc: not available
URL: <http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20150405/6d74a11e/attachment-0001.png>

