[mvapich-discuss] error associated with MPI_BCAST and MPI_SEND?

Raymond Richardson rayrick1 at gmail.com
Fri Mar 21 09:29:45 EDT 2008


Hi all,

I'm hoping one of you developers out there has seen the kind of behavior
that's plaguing me at present. I'm running a large, complex numerical
weather prediction model that is written in Fortran and uses MPI for
parallelization. I'm running on an Opteron cluster running Linux and
using InfiniBand, with MVAPICH 0.9.9 and the PathScale 3.1 Fortran
compiler.

What I'm seeing is intermittent crashes with the following error:

[cm.c: line 142]Couldn't create RC QP

Compiling my code with options like trapuv and checkbounds doesn't reveal any
problems. Changing the order of nodes in my node list, or adding print
statements, changes when and where these crashes happen. They seem to be
associated with MPI_BCAST and MPI_SEND calls. I have an older cluster using
MPICH and Myrinet, and the code runs fine there. Does this mean anything to
anyone out there?

Thanks a lot,

Ray Richardson