[mvapich-discuss] Rndv Receiver is receiving less than as expected

Aaron Knister aaron.knister at gmail.com
Mon Jun 28 18:11:53 EDT 2010


Hi,

I'm running mvapich2-1.4rc2 with SLURM as the PMI and am having some 
difficulties with gromacs-4.0.7. I haven't pinned down the exact 
threshold, but at processor counts somewhere above 40 (and definitely 
at 80 and higher) the gromacs application terminates after some time 
(the amount of time varies slightly between runs) with this error:


Warning! Rndv Receiver is receiving (13760 < 24768) less than as expected
Fatal error in MPI_Alltoall:
Message truncated, error stack:
MPI_Alltoall(734)......................: MPI_Alltoall(sbuf=0x1672840, 
scount=344, MPI_FLOAT, rbuf=0x2aaaad349360, rcount=344, MPI_FLOAT, 
comm=0xc4000000) failed
MPIR_Alltoall(193).....................:
MPIDI_CH3U_Post_data_receive_found(445): Message from rank 21 and tag 9 
truncated; 24768 bytes received but buffer size is 13760
Warning! Rndv Receiver is receiving (22016 < 27520) less than as expected
Fatal error in MPI_Alltoall:
Message truncated, error stack:
MPI_Alltoall(734)......................: 
MPI_Alltoall(sbuf=0x2aaaad3ce4e0, scount=344, MPI_FLOAT, 
rbuf=0x1e6af900, rcount=344, MPI_FLOAT, comm=0xc4000004) failed
MPIR_Alltoall(193).....................:
MPIDI_CH3U_Post_data_receive_found(445): Message from rank 17 and tag 9 
truncated; 27520 bytes received but buffer size is 22016
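
For what it's worth, the byte counts in the log can be checked against 
the MPI_Alltoall arguments. The sketch below is only arithmetic on the 
numbers printed above; the 4-byte size for MPI_FLOAT and the helper 
name blocks() are assumptions made for illustration, not anything taken 
from mvapich2 or gromacs.

```python
# Sizes copied from the error log; everything else is illustrative.
FLOAT_BYTES = 4   # assumed size of MPI_FLOAT on this platform
SCOUNT = 344      # scount/rcount reported in the MPI_Alltoall call

# Bytes one rank sends to one other rank in this alltoall.
block_bytes = SCOUNT * FLOAT_BYTES


def blocks(nbytes):
    """Return nbytes as a count of per-rank blocks, or None if not a whole multiple."""
    quotient, remainder = divmod(nbytes, block_bytes)
    return quotient if remainder == 0 else None


# (receive buffer size, incoming message size) pairs from the two failures.
pairs = [(13760, 24768), (22016, 27520)]
summary = [(blocks(buf), blocks(msg)) for buf, msg in pairs]

for (buf, msg), (nbuf, nmsg) in zip(pairs, summary):
    print(f"buffer {buf} B = {nbuf} blocks, message {msg} B = {nmsg} blocks")
```

Under that assumption every size in the log is a whole multiple of one 
344-float block, so the two sides appear to disagree on how many blocks 
a message carries rather than on the block size itself. That is only a 
reading of the numbers, not a diagnosis.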

The sizes of the buffers aren't identical each time, but the rank 
numbers that throw the errors seem to be consistent. The error doesn't 
occur with OpenMPI, which, interestingly, runs the code significantly 
faster than mvapich2, although I don't know why. I've also tried 
mvapich2-1.5rc2 and the error is still present. Please let me know if 
you need any additional information from me.

Thanks in advance!

-Aaron

