[mvapich-discuss] FATAL event IBV_EVENT_QP_LAST_WQE_REACHED

Pierrick Penven Pierrick.Penven at ird.fr
Tue Mar 27 12:33:03 EDT 2007


Dear all,

I am trying to install an ocean model on a cluster whose nodes each have two
dual-core 64-bit AMD Opterons, connected with InfiniBand, using pathf90 and
mvapich v0.9.8.

The model runs and scales well on 1 node, but it is not able to run on several
nodes. I have also tried mvapich v0.9.9.beta, and I get the following
message:

[1:chpcc060] Abort: [1:chpcc060] Abort: [chpcc060:1] Got completion with error 
IBV_WC_LOC_PROT_ERR, code=4 at line 2374 in file viacheck.c
[1] Got FATAL event IBV_EVENT_QP_LAST_WQE_REACHED, code=16 at line 2554 in 
file viacheck.c
[0:chpcc058] Abort: [0] Got FATAL event IBV_EVENT_QP_LAST_WQE_REACHED, code=16
 at line 2554 in file viacheck.c
/CHPC/usr/local/mvapich_099b/bin/mpirun: line 1: 17412 
Terminated              /CHPC/usr/local/mvapich_099b/bin/mpirun_rsh -np 
2 -hostfile /CHPC/home/loadl/execute/chpcln.4269.0.machinefile /CHPC/home/ppenven/Roms_tools/TEST1/./roms
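
For triage, the wrapped messages above can be condensed into one entry per
distinct error, in order of first appearance. This is only a minimal sketch in
Python, not part of MVAPICH; the names LOG, PATTERN and summarize are made up
for illustration, and the pattern assumes the "NAME, code=N at line N in file F"
layout seen in the output above.

```python
import re
from collections import Counter

# Error text copied from the output above, hard-wrapped as in the message.
LOG = """\
[1:chpcc060] Abort: [1:chpcc060] Abort: [chpcc060:1] Got completion with error
IBV_WC_LOC_PROT_ERR, code=4 at line 2374 in file viacheck.c
[1] Got FATAL event IBV_EVENT_QP_LAST_WQE_REACHED, code=16 at line 2554 in
file viacheck.c
[0:chpcc058] Abort: [0] Got FATAL event IBV_EVENT_QP_LAST_WQE_REACHED, code=16
 at line 2554 in file viacheck.c
"""

# Assumed layout: "<IBV_NAME>, code=<n> at line <n> in file <name>".
PATTERN = re.compile(r"(IBV_[A-Z_]+), code=(\d+) at line (\d+) in file (\S+)")


def summarize(log):
    """Count each distinct (name, code, file, line) error in a wrapped log."""
    # Collapse all whitespace so messages split across lines match again.
    flat = " ".join(log.split())
    counts = Counter()
    for name, code, line, fname in PATTERN.findall(flat):
        counts[(name, int(code), fname, int(line))] += 1
    return counts


if __name__ == "__main__":
    # Counter keeps insertion order, so the first error reported comes first.
    for (name, code, fname, line), n in summarize(LOG).items():
        print(f"{n}x {name} (code={code}) at {fname}:{line}")
```

The completion error IBV_WC_LOC_PROT_ERR is listed first here, so it may be the
message worth investigating before the later FATAL events.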

The problem does not occur when using MPI over IP rather than VAPI.

Is there a solution to this problem?

Thanks a lot

Pierrick

