[mvapich-discuss] The performance of MPI_Waitall

Akshay Venkatesh akshay at cse.ohio-state.edu
Tue Dec 23 23:39:35 EST 2014


Deng Lin,

Non-blocking collectives make true communication progress in the background
mostly when assisted by hardware offload engines or dedicated progress
threads. What you're seeing here is the original 2.5s of communication time
being split into 0.3s of issuing the non-blocking operations and 2.2s of
progressing the actual communication inside MPI_Waitall. Ideally, with
external communication offload agents the 2.2s should shrink to a minimal
value, but those agents will only become available in MVAPICH in the
future. What you could try for the time being is to insert MPI_Test
(http://mpi.deino.net/mpi_functions/MPI_Test.html) or MPI_Testany calls
within your computation region so that some progress takes place without
relying entirely on MPI_Waitall for all of the communication progress.
This may increase the computation-phase time, but hopefully the aggregate
time drops below the current 32.5s (the ~30s of computation plus 2.5s of
communication you measure now). Please let us know if this helps.
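
For illustration, here is a rough sketch of that idea laid over the loop
structure in your message below. The chunking of the computation and the
names sparse_ialltoallv(), compute_chunk(), NCHUNKS and the request array
size are placeholders for your own code, not MVAPICH APIs; the point is
only the periodic MPI_Testany call inside the compute phase:

#include <mpi.h>

#define NCHUNKS 64   /* split the ~30s of computation into smaller chunks */

/* placeholder: posts the MPI_Isend/MPI_Irecv pairs, returns request count */
extern int  sparse_ialltoallv(MPI_Request *reqs);
/* placeholder: one slice of the per-timestep computation */
extern void compute_chunk(int chunk);

void timestep(void)
{
    MPI_Request reqs[128];               /* assume at most 128 neighbor requests */
    int nreqs = sparse_ialltoallv(reqs);
    int done  = 0;

    for (int chunk = 0; chunk < NCHUNKS; chunk++) {
        compute_chunk(chunk);

        if (!done) {
            int idx, flag;
            /* let MPI progress the pending sends/receives while we compute */
            MPI_Testany(nreqs, reqs, &idx, &flag, MPI_STATUS_IGNORE);
            if (flag && idx == MPI_UNDEFINED)
                done = 1;                /* every request has completed */
        }
    }

    /* anything still outstanding finishes here; with the MPI_Testany calls
       above this should cost much less than the 2.2s you are measuring */
    MPI_Waitall(nreqs, reqs, MPI_STATUSES_IGNORE);
}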

Thanks

On Tue, Dec 23, 2014 at 10:19 PM, 林灯 <lind at lsec.cc.ac.cn> wrote:

>
> Hi all,
>     I am a user of MVAPICH2. Recently I have been trying to use
> non-blocking communication to hide the cost of communication. However,
> something unexpected happens and I am stuck on it. The problem concerns
> the performance of MPI_Waitall.
>
> The structure of my code is like this:
>
> for (ts = 0; ts < 2000; ts++)
> {
>     // ...
>     for (...)
>     {
>         // do some calculation
>     }
>
>     // sparse_ialltoallv() is a customized non-blocking neighborhood
>     // communication function like MPI_Ineighbor_alltoallv. It simply
>     // uses MPI_Isend and MPI_Irecv, in case some clusters don't provide
>     // an MPI library supporting MPI 3.0.
>     sparse_ialltoallv(...);
>
>     // Do some calculation to hide the cost of communication.
>     // The total cost of communication is about 2.5s and the total cost
>     // of calculation is about 30s. Thus, I think the calculation is
>     // large enough to hide the communication.
>
>     MPI_Waitall(...);
>     // ...
> }
>
> In practice, I find that the communication cost I can hide is only about
> 0.3s, while the cost of MPI_Waitall (over the 2000 iterations) is nearly
> 2.2s, and both costs increase with the number of processes in my cluster.
> In this sense, some part of the communication cost, e.g. the cost of
> MPI_Waitall, cannot be hidden. Is that true? What does MPI_Waitall
> actually do? Is there anything I can do to reduce the cost of MPI_Waitall
> or replace it with other MPI API(s)?
>
>
> Deng Lin
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
>


-- 
- Akshay