[mvapich-discuss] FATAL event IBV_EVENT_QP_LAST_WQE_REACHED

Nathan Dauchy Nathan.Dauchy at noaa.gov
Fri Aug 3 19:19:33 EDT 2007


Pierrick, All,

We recently upgraded to OFED-1.2 (mvapich-0.9.9) and are now getting an
error that looks similar to yours:

[0:w72] Abort: [0] Got FATAL event IBV_EVENT_QP_LAST_WQE_REACHED,
code=16 at line 2552 in file viacheck.c

Did you ever find a solution?  (I can't find one in the archives.)

Can someone explain what the IBV_EVENT_QP_LAST_WQE_REACHED error means?
I can't find any clues in the source, and have been unable to turn up
any relevant docs either.

Thanks for any help and clues you can offer!

Regards,
Nathan


Pierrick Penven, Tue Mar 27 12:33:03 EDT 2007:
> Dear all,
> 
> I am trying to install an ocean model on a cluster based on 64-bit bi-dual 
> core AMD Opterons with InfiniBand, using pathf90 and mvapich v0.9.8.
> 
> The model runs and scales well on 1 node, but is not able to run on several 
> nodes.  I have tried using mvapich v0.9.9.beta, and I get the following 
> message:
> 
> [1:chpcc060] Abort: [1:chpcc060] Abort: [chpcc060:1] Got completion with error 
> IBV_WC_LOC_PROT_ERR, code=4 at line 2374 in file viacheck.c
> [1] Got FATAL event IBV_EVENT_QP_LAST_WQE_REACHED, code=16 at line 2554 in 
> file viacheck.c
> [0:chpcc058] Abort: [0] Got FATAL event IBV_EVENT_QP_LAST_WQE_REACHED, code=16
>  at line 2554 in file viacheck.c
> /CHPC/usr/local/mvapich_099b/bin/mpirun: line 1: 17412 
> Terminated              /CHPC/usr/local/mvapich_099b/bin/mpirun_rsh -np 
> 2 -hostfile /CHPC/home/loadl/execute/chpcln.4269.0.machinefile /CHPC/home/ppenven/Roms_tools/TEST1/./roms
> 
> The problem does not occur when using MPI over IP rather than VAPI.
> 
> Is there a solution to this problem?
> 
> Thanks a lot
> 
> Pierrick


