[mvapich-discuss] mvapich alltoall(v)
Lars Paul Huse
Lars.Paul.Huse at Sun.COM
Fri Apr 25 04:10:21 EDT 2008
Hi all,
In a large IB fabric we are observing application performance
degradation (using mvapich 0.9.9) that seems to be correlated with
running MPI_Alltoall or MPI_Alltoallv. To get a uniform traffic
pattern, MPI_Alltoall(v) might be sketched as follows *):
MPI_Alltoall(v)
{
    MPI_Sendrecv(to myself);    /* local block first */
    if (size > 1) {
        /* in step i, receive from (rank+i)%size and send to
           (size+rank-i)%size, so every rank talks to a different
           peer in every step */
        for (i = 1; i < size; i++)
            MPI_Irecv(source = (rank + i) % size);
        for (i = 1; i < size; i++)
            MPI_Isend(destin = (size + rank - i) % size);
        MPI_Waitall(2 * (size - 1));
    }
}
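
For concreteness, a compilable version of that staggered schedule
might look like the following. This is my own sketch against the
plain MPI interface, not MVAPICH code; the name staggered_alltoall
and the block-offset arithmetic are mine, and error handling is
omitted:

#include <mpi.h>
#include <stdlib.h>

/* Sketch only, not MVAPICH source: staggered alltoall where in step
 * i each rank receives from (rank+i)%size and sends to
 * (size+rank-i)%size, so the size-1 senders never converge on a
 * single receiver. */
static int staggered_alltoall(const void *sbuf, void *rbuf, int count,
                              MPI_Datatype dtype, MPI_Comm comm)
{
    int rank, size, i, nreq = 0;
    MPI_Aint extent;
    MPI_Status st;

    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    MPI_Type_extent(dtype, &extent);

    /* exchange the local block with myself first */
    MPI_Sendrecv((char *)sbuf + (MPI_Aint)rank * count * extent, count,
                 dtype, rank, 0,
                 (char *)rbuf + (MPI_Aint)rank * count * extent, count,
                 dtype, rank, 0, comm, &st);

    if (size > 1) {
        MPI_Request *reqs = malloc(2 * (size - 1) * sizeof(MPI_Request));
        MPI_Status *sts = malloc(2 * (size - 1) * sizeof(MPI_Status));
        for (i = 1; i < size; i++) {            /* staggered receives */
            int src = (rank + i) % size;
            MPI_Irecv((char *)rbuf + (MPI_Aint)src * count * extent,
                      count, dtype, src, 0, comm, &reqs[nreq++]);
        }
        for (i = 1; i < size; i++) {            /* staggered sends */
            int dst = (size + rank - i) % size;
            MPI_Isend((char *)sbuf + (MPI_Aint)dst * count * extent,
                      count, dtype, dst, 0, comm, &reqs[nreq++]);
        }
        MPI_Waitall(nreq, reqs, sts);
        free(sts);
        free(reqs);
    }
    return MPI_SUCCESS;
}

It can be dropped in where MPI_Alltoall would otherwise be called,
e.g. staggered_alltoall(sbuf, rbuf, count, MPI_INT, MPI_COMM_WORLD);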
I assume the collectives are implemented in the int??_fns.c files in
the source tree. The default implementation of MPI_Alltoallv in
mvapich-0.9.9/src/coll/intra_fns.c looks OK-ish, but in what I assume
is the IB-relevant implementation in
mvapich-0.9.9/mpid/vapi/intra_fns.c the source & destin indexes are
equal to the loop counter, i.e. all ranks effectively stress one MPI
process at a time (see the sketch below). Can someone please confirm
or disprove this observation?
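
If I read it right, the inner loops would behave roughly as below.
This is my paraphrase of what I think the code does, not the actual
source; declarations (rank, size, count, dtype, extent, reqs, sts,
nreq) as in the staggered sketch above:

/* suspected pattern: peer index == loop counter on every rank */
for (i = 0; i < size; i++)
    if (i != rank)
        MPI_Irecv((char *)rbuf + (MPI_Aint)i * count * extent,
                  count, dtype, i, 0, comm, &reqs[nreq++]);
for (i = 0; i < size; i++)
    if (i != rank)
        MPI_Isend((char *)sbuf + (MPI_Aint)i * count * extent,
                  count, dtype, i, 0, comm, &reqs[nreq++]);
MPI_Waitall(nreq, reqs, sts);
/* in step i all size-1 senders address the same peer i, so the
   fabric hot-spots on one process at a time */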
/lars paul
PS! Please bear with me for my lack of basic MVAPICH knowledge.
*) other source/destin sequencing might give better performance :-)
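
One such alternative (a sketch, assuming the communicator size is a
power of two): pairwise exchange, where in step i each rank exchanges
with peer = rank ^ i, so traffic always flows in disjoint pairs.
Declarations again as in the staggered sketch above:

/* pairwise exchange: only valid when size is a power of two,
 * since rank ^ i then enumerates every peer exactly once and the
 * pairing is symmetric (peer's peer in step i is rank itself) */
for (i = 1; i < size; i++) {
    int peer = rank ^ i;
    MPI_Sendrecv((char *)sbuf + (MPI_Aint)peer * count * extent,
                 count, dtype, peer, 0,
                 (char *)rbuf + (MPI_Aint)peer * count * extent,
                 count, dtype, peer, 0, comm, &st);
}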