[mvapich-discuss] mvapich alltoall(v)

amith rajith mamidala mamidala at cse.ohio-state.edu
Fri Apr 25 11:08:33 EDT 2008


Hi Lars,

We have optimized MPI_Alltoall and MPI_Alltoallv in the latest mvapich-1.0
branch. Can you download it and see whether you get better performance?

Also, I assume that you are using the gen2 device and not vapi.
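
For a quick comparison between the two versions, a simple timing loop
along these lines might help (just a rough sketch, not one of our
benchmarks; the message size and iteration count are arbitrary):

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* Rough sketch: time MPI_Alltoall and report the average per-iteration
 * time on rank 0.  Run it once with mvapich-0.9.9 and once with the
 * mvapich-1.0 branch and compare the numbers. */
int main(int argc, char **argv)
{
    int rank, size, i, iters = 100, count = 1024;  /* ints sent per peer */
    double t0, t1;
    int *sbuf, *rbuf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    sbuf = malloc((size_t) size * count * sizeof(int));
    rbuf = malloc((size_t) size * count * sizeof(int));
    for (i = 0; i < size * count; i++)
        sbuf[i] = rank;

    /* One warm-up exchange so connection setup is not timed. */
    MPI_Alltoall(sbuf, count, MPI_INT, rbuf, count, MPI_INT, MPI_COMM_WORLD);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < iters; i++)
        MPI_Alltoall(sbuf, count, MPI_INT, rbuf, count, MPI_INT, MPI_COMM_WORLD);
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("MPI_Alltoall (%d ints/peer): %.1f us avg\n",
               count, 1e6 * (t1 - t0) / iters);

    free(sbuf);
    free(rbuf);
    MPI_Finalize();
    return 0;
}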

Thanks,
Amith

On Fri, 25 Apr 2008, Lars Paul Huse wrote:

> Hi all,
>
> In a large IB fabric we are observing application performance
> degradation (using mvapich 0.9.9) that seems to be correlated with running
> MPI_Alltoall or MPI_Alltoallv. To get a uniform traffic pattern,
> MPI_Alltoall(v) might be implemented roughly as follows *):
>
>
> MPI_Alltoall(v)
> {
>      MPI_Sendrecv(to myself);
>      if (size > 1) {
>          for (i = 1; i < size; i++)
>              MPI_Irecv(source = (rank + i) % size);
>          for (i = 1; i < size; i++)
>              MPI_Isend(destin = (size + rank - i) % size);
>          MPI_Waitall(2*(size-1));
>      }
> }
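>
> To be concrete, a compilable sketch of that staggered schedule could
> look like the code below (purely illustrative on my side, not taken
> from any MVAPICH source; byte counts are assumed equal for all peers):
>
> #include <mpi.h>
> #include <stdlib.h>
> #include <string.h>
>
> /* Each rank receives from (rank + i) % size and sends to
>  * (size + rank - i) % size in step i, so the load is spread over
>  * all processes instead of hitting one process at a time. */
> static void staggered_alltoall(char *sbuf, char *rbuf, int bytes,
>                                MPI_Comm comm)
> {
>     int rank, size, i, n = 0;
>     MPI_Request *reqs;
>     MPI_Status *stats;
>
>     MPI_Comm_rank(comm, &rank);
>     MPI_Comm_size(comm, &size);
>     reqs  = malloc(2 * (size - 1) * sizeof(MPI_Request));
>     stats = malloc(2 * (size - 1) * sizeof(MPI_Status));
>
>     /* Local block is just copied. */
>     memcpy(rbuf + rank * bytes, sbuf + rank * bytes, bytes);
>
>     for (i = 1; i < size; i++) {
>         int src = (rank + i) % size;
>         MPI_Irecv(rbuf + src * bytes, bytes, MPI_BYTE, src, 0,
>                   comm, &reqs[n++]);
>     }
>     for (i = 1; i < size; i++) {
>         int dst = (size + rank - i) % size;
>         MPI_Isend(sbuf + dst * bytes, bytes, MPI_BYTE, dst, 0,
>                   comm, &reqs[n++]);
>     }
>     MPI_Waitall(n, reqs, stats);
>
>     free(stats);
>     free(reqs);
> }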
>
> I assume that the collectives are implemented in the int??_fns.c files
> in the source tree. The default implementation of MPI_Alltoallv in
> mvapich-0.9.9/src/coll/intra_fns.c looks OK-ish, but in what I assume
> is the IB-relevant implementation in mvapich-0.9.9/mpid/vapi/intra_fns.c
> the source & destin indexes are equal to the loop counter, i.e.
> stressing one MPI process at a time. Can someone please confirm or
> disprove my observation?
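>
> To illustrate the difference, this toy program (no MPI involved, just
> my schematic view of the two schedules) prints which peer each rank
> targets at each step: with the loop-counter indexing every rank targets
> the same process in a given step, while the rotated schedule above
> pairs everyone with a distinct partner:
>
> #include <stdio.h>
>
> int main(void)
> {
>     int size = 4, i, r;   /* pretend 4 MPI processes */
>
>     for (i = 1; i < size; i++) {
>         printf("step %d  loop-counter:", i);
>         for (r = 0; r < size; r++)
>             printf(" %d->%d", r, i);                     /* all hit rank i */
>         printf("   rotated:");
>         for (r = 0; r < size; r++)
>             printf(" %d->%d", r, (size + r - i) % size); /* distinct peers */
>         printf("\n");
>     }
>     return 0;
> }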
>
> /lars paul
>
> PS! Please bear with me, as I lack basic MVAPICH knowledge.
>
> *) other source/destin sequencing might give better performance :-)
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>


