[mvapich-discuss] Fwd: Completion of non-blocking collective MPI_Iallgather

pramod kumbhar pramod.s.kumbhar at gmail.com
Sat Apr 4 05:00:27 EDT 2015


Hi Akshay,

Is this fixed/improved in the latest release?

Regards,
Pramod

On Fri, Jan 30, 2015 at 12:27 AM, pramod kumbhar <pramod.s.kumbhar at gmail.com
> wrote:

> Hi Akshay,
>
> Thanks for the update. I tried this and it seems better now with the
> extra MPI_Test() calls.
>
> -Pramod
>
>
> On Wed, Jan 28, 2015 at 9:22 PM, Akshay Venkatesh <
> akshay at cse.ohio-state.edu> wrote:
>
>> Hi Pramod,
>>
>> We investigated the issue you reported and were able to reproduce it on
>> our local machines. Unfortunately there isn't an immediate fix for this,
>> owing to an algorithmic characteristic of Iallgather. We are working on
>> an alternative that allows for better overlap, but for now one way to
>> make at least partial progress on the outstanding non-blocking collective
>> is to issue a few rounds of MPI_Test() rather than the single call you
>> have in your code. Hopefully this lets you benefit from some partial
>> overlap.
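>>
>> As a rough sketch of that pattern (the send/receive buffers, counts, and
>> the inner test count below are placeholders, not anything specific to
>> MVAPICH2):
>>
>> MPI_Request req;
>> MPI_Iallgather(sendbuf, count, MPI_INT,
>>                recvbuf, count, MPI_INT, MPI_COMM_WORLD, &req);
>>
>> int flag = 0;
>> for (int i = 0; i < 4; i++) {
>>     compute();                       /* application work */
>>     /* a few short MPI_Test calls per iteration give the library
>>        more chances to progress the collective than a single test */
>>     for (int t = 0; t < 8 && !flag; t++)
>>         MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
>> }
>> if (!flag)
>>     MPI_Wait(&req, MPI_STATUS_IGNORE);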
>>
>> On Tue, Jan 27, 2015 at 9:45 AM, pramod kumbhar <
>> pramod.s.kumbhar at gmail.com> wrote:
>>
>>> Could someone provide any suggestion on this?
>>>
>>> Thanks.
>>>
>>> ---------- Forwarded message ----------
>>> From: Kumbhar Pramod Shivaji <pramod.kumbhar at epfl.ch>
>>> Date: Mon, Jan 26, 2015 at 6:02 PM
>>> Subject: [mvapich-discuss] Completion of non-blocking collective
>>> MPI_Iallgather
>>> To: "mvapich-discuss at cse.ohio-state.edu" <
>>> mvapich-discuss at cse.ohio-state.edu>
>>>
>>>
>>>  Hello All,
>>>
>>>  I am looking into MPI non-blocking collectives (specifically
>>> MPI_Iallgather & MPI_Iallgatherv) and wondering about the internal
>>> completion / progress of these routines. To explain, I have the
>>> following pseudo example:
>>>
>>>  MPI_Iallgather(..., &req);      // called by all ranks at the same time
>>> ...
>>> while (i < 4) {
>>>     compute(...);                // compute for a few seconds
>>>     MPI_Test(&req, &flag, ...);  // Can I expect this flag to become true,
>>>                                  // i.e. does the operation complete here
>>>                                  // at some point?
>>>     i++;
>>> }
>>> ...
>>> MPI_Wait(&req, ...);  // I see that the MPI_Iallgather only finishes when
>>>                       // everyone calls MPI_Wait at this point. Why?
>>>
>>>
>>>  I was running this test example on a single node with 8 MPI ranks. I
>>> see that MPI_Test always returns false. With a compute() function that
>>> takes about two seconds per iteration, I expect the MPI_Iallgather to
>>> finish well before the loop ends. Instead, it only completes inside
>>> MPI_Wait(). See the attached sample trace of the program.
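>>>
>>>  For completeness, here is a self-contained version of the test I am
>>> describing (the two-second busy-wait and the one-int-per-rank buffers
>>> are just placeholders for my real workload):
>>>
>>> #include <mpi.h>
>>> #include <stdio.h>
>>> #include <stdlib.h>
>>>
>>> /* stand-in for the real work: busy-wait for roughly two seconds */
>>> static void compute(void)
>>> {
>>>     double start = MPI_Wtime();
>>>     while (MPI_Wtime() - start < 2.0)
>>>         ;
>>> }
>>>
>>> int main(int argc, char **argv)
>>> {
>>>     int rank, size, flag = 0;
>>>     MPI_Request req;
>>>
>>>     MPI_Init(&argc, &argv);
>>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>>>
>>>     int sendval = rank;
>>>     int *recvbuf = malloc(size * sizeof(int));
>>>
>>>     /* all ranks start the non-blocking allgather at the same time */
>>>     MPI_Iallgather(&sendval, 1, MPI_INT, recvbuf, 1, MPI_INT,
>>>                    MPI_COMM_WORLD, &req);
>>>
>>>     for (int i = 0; i < 4; i++) {
>>>         compute();
>>>         MPI_Test(&req, &flag, MPI_STATUS_IGNORE);
>>>         if (rank == 0)
>>>             printf("iteration %d: flag = %d\n", i, flag);
>>>     }
>>>
>>>     MPI_Wait(&req, MPI_STATUS_IGNORE);  /* completion is observed here */
>>>     free(recvbuf);
>>>     MPI_Finalize();
>>>     return 0;
>>> }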
>>>
>>>  Could anyone point out the possible cause of this behaviour? It would
>>> be a great help.
>>>
>>>
>>>  Regards,
>>> Pramod
>>>
>>>  MVAPICH2 version: 2.0
>>> Configure command:  /configure --with-slurm
>>>  --with-slurm-include=slurm/default/include
>>> --with-slurm-lib=slurm/default/lib  --with-pm=none --with-pmi=slurm
>>> --enable-threads=multiple --enable-shared --enable-sharedlibs=gcc
>>> --enable-fc --enable-cxx --with-mpe --enable-rdma-cm --enable-fast
>>> --enable-smpcoll --with-hwloc --enable-xrc --with-device=ch3:mrail
>>> --with-rdma=gen2 --enable-cuda --with-cuda=/usr/local/cuda-5.0
>>> --with-cuda-include=/usr/local/cuda-5.0/include
>>> --with-cuda-lib=/usr/local/cuda-5.0/lib64 --enable-g=dbg --enable-debuginfo
>>> CC=gcc CXX=g++ FC=gfortran F77=gfortran
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> mvapich-discuss mailing list
>>> mvapich-discuss at cse.ohio-state.edu
>>> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>>
>>>
>>
>>
>> --
>> - Akshay
>>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mpi_iallgather_trace.png
Type: image/png
Size: 79473 bytes
Desc: not available
URL: <http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20150404/769ee504/attachment-0001.png>

