[mvapich-discuss] mvapich alltoall(v)
amith rajith mamidala
mamidala at cse.ohio-state.edu
Fri Apr 25 11:08:33 EDT 2008
Hi Lars,
We have optimized MPI_Alltoall and MPI_Alltoallv in the latest mvapich-1.0
branch. Could you download it and see whether you get better performance?
Also, I assume that you are using gen2 device and not vapi.
Thanks,
Amith
On Fri, 25 Apr 2008, Lars Paul Huse wrote:
> Hi all,
>
> In a large IB fabric we are observing application performance
> degradation (using mvapich 0.9.9) that seems to be correlated with
> running MPI_Alltoall or MPI_Alltoallv. To get a uniform traffic
> pattern, MPI_Alltoall(v) might be implemented roughly as *):
>
>
> MPI_Alltoall(v)
> {
>     MPI_Sendrecv(to myself);
>     if (size > 1) {
>         for (i = 1; i < size; i++)
>             MPI_Irecv(source = (rank + i) % size);
>         for (i = 1; i < size; i++)
>             MPI_Isend(destin = (size + rank - i) % size);
>         MPI_Waitall(2 * (size - 1));
>     }
> }
>
> I assume that the collectives are implemented in the int??_fns.c files
> in the source tree. For the default implementation of MPI_Alltoallv in
> mvapich-0.9.9/src/coll/intra_fns.c things look OK-ish, but in what I
> assume is the IB-relevant implementation in
> mvapich-0.9.9/mpid/vapi/intra_fns.c the source and destin indexes are
> both equal to the loop counter, i.e. all ranks stress one MPI process
> at a time. Can someone please confirm or disprove my observation?
>
> /lars paul
>
> PS! Please bear with me for my lack of basic MVAPICH knowledge.
>
> *) other source/destin sequencing might give better performance :-)
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>