[mvapich-discuss] mvapich alltoall(v)

Lars Paul Huse Lars.Paul.Huse at Sun.COM
Fri Apr 25 04:10:21 EDT 2008


Hi all,

In a large IB fabric we are observing application performance 
degradation (using mvapich 0.9.9) that seems to be correlated with running 
MPI_Alltoall or MPI_Alltoallv. To get a uniform traffic pattern, 
MPI_Alltoall(v) could be written roughly as follows *):


MPI_Alltoall(v)
{
     MPI_Sendrecv(to myself);                      /* local exchange first */
     if (size > 1) {
          for (i = 1; i < size; i++)
               MPI_Irecv(source = (rank + i) % size);
          for (i = 1; i < size; i++)
               MPI_Isend(destin = (size + rank - i) % size);
          MPI_Waitall(2 * (size - 1));              /* complete all transfers */
     }
}
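
For completeness, here is a minimal compilable sketch of the same staggered 
exchange pattern. The buffer layout, the MPI_CHAR datatype, the tag and the 
function name are my own illustrative assumptions, not taken from the 
MVAPICH sources:

#include <mpi.h>
#include <stdlib.h>

/* Staggered pairwise alltoall sketch: rank r receives from (r+i) and
   sends to (r-i) in step i, so no single process is targeted by all
   others at once. */
static void staggered_alltoall(const char *sendbuf, char *recvbuf,
                               int count, MPI_Comm comm)
{
    int rank, size, i;
    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);

    /* Exchange with myself first. */
    MPI_Sendrecv(sendbuf + (size_t)rank * count, count, MPI_CHAR, rank, 0,
                 recvbuf + (size_t)rank * count, count, MPI_CHAR, rank, 0,
                 comm, MPI_STATUS_IGNORE);

    if (size > 1) {
        MPI_Request *reqs = malloc(2 * (size - 1) * sizeof(MPI_Request));
        int n = 0;

        for (i = 1; i < size; i++) {
            int src = (rank + i) % size;
            MPI_Irecv(recvbuf + (size_t)src * count, count, MPI_CHAR,
                      src, 0, comm, &reqs[n++]);
        }
        for (i = 1; i < size; i++) {
            int dst = (size + rank - i) % size;
            MPI_Isend(sendbuf + (size_t)dst * count, count, MPI_CHAR,
                      dst, 0, comm, &reqs[n++]);
        }
        MPI_Waitall(n, reqs, MPI_STATUSES_IGNORE);
        free(reqs);
    }
}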

I assume that the collectives are implemented in the int??_fns.c files 
in the source tree. The default implementation of MPI_Alltoallv in 
mvapich-0.9.9/src/coll/intra_fns.c looks OK'ish, but in what I 
assume is the IB-relevant implementation in 
mvapich-0.9.9/mpid/vapi/intra_fns.c the source and destination indexes are 
equal to the loop counter, i.e. in step i every rank targets the same MPI 
process, stressing one process at a time. Can someone please confirm or 
disprove this observation?

/lars paul

PS! Please bear with my lack of basic MVAPICH knowledge.

*) other source/destin sequencing might give better performance :-)

