[mvapich-discuss] FATAL event IBV_EVENT_QP_LAST_WQE_REACHED

Sayantan Sur surs at cse.ohio-state.edu
Sat Aug 4 15:27:21 EDT 2007


Hi Nathan,

Nathan Dauchy wrote:
> Pierrick, All,
>
> We recently upgraded to OFED-1.2 (mvapich-0.9.9) and are now getting an
> error that looks similar to yours:
>
> [0:w72] Abort: [0] Got FATAL event IBV_EVENT_QP_LAST_WQE_REACHED,
> code=16 at line 2552 in file viacheck.c
>
> Did you ever find a solution?  (I can't find one in the archives.)
>
> Can someone explain what the IBV_EVENT_QP_LAST_WQE_REACHED error means?
>   I can't find any clues in the source, and have been unable to turn up
> any relevant docs either.
>
> Thanks for any help and clues you can offer!
>   


Thanks for reporting the problem. The event 
IBV_EVENT_QP_LAST_WQE_REACHED means that the QP (the queue pair, i.e. the 
internal InfiniBand communication channel) has gone into the error state 
and the last outstanding receive request posted to it has been consumed, 
so no further completions will be generated on that QP. Could it be 
related to a setup issue? Can you run any other MPI programs, such as the 
OSU benchmarks or IMB, on all of these nodes?
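
In case it helps with debugging: this event is delivered through the 
libibverbs asynchronous event queue rather than as a work completion, 
which is presumably why viacheck.c reports it as a "FATAL event". Below 
is a minimal, illustrative sketch (not MVAPICH's actual code) of how such 
events are received, assuming a device context that has already been 
opened elsewhere:

#include <stdio.h>
#include <infiniband/verbs.h>

/* Illustrative only: block on the async event queue of an already-opened
 * device context and report QP-related events such as
 * IBV_EVENT_QP_LAST_WQE_REACHED. */
static void drain_async_events(struct ibv_context *ctx)
{
    struct ibv_async_event ev;

    while (ibv_get_async_event(ctx, &ev) == 0) {
        switch (ev.event_type) {
        case IBV_EVENT_QP_LAST_WQE_REACHED:
            /* The QP has gone into the error state and the last work
             * request posted to its receive queue has been reached, so
             * no further receive completions will arrive on this QP. */
            fprintf(stderr, "QP 0x%x: last WQE reached (QP in error state)\n",
                    ev.element.qp->qp_num);
            break;
        case IBV_EVENT_QP_FATAL:
            fprintf(stderr, "QP 0x%x: fatal error\n", ev.element.qp->qp_num);
            break;
        default:
            fprintf(stderr, "async event %d\n", ev.event_type);
            break;
        }
        /* Every event returned by ibv_get_async_event() must be acked. */
        ibv_ack_async_event(&ev);
    }
}

For the sanity check mentioned above, a two-process run of one of the OSU 
benchmarks between the suspect nodes would be a good first test (the exact 
paths and hostfile are of course specific to your installation), e.g.:

    mpirun_rsh -np 2 -hostfile ./hosts ./osu_latency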

Thanks,
Sayantan.

> Regards,
> Nathan
>
>
> Pierrick Penven, Tue Mar 27 12:33:03 EDT 2007:
>   
>> Dear all,
>>
>> I am trying to install an ocean model on a cluster of 64-bit nodes, each 
>> with two dual-core AMD Opterons, with InfiniBand, using pathf90 and mvapich v0.9.8.
>>
>> The model runs and scales well on 1 node, but is not able to run on several 
>> nodes.  I have also tried mvapich v0.9.9 beta, and I get the following 
>> message:
>>
>> [1:chpcc060] Abort: [1:chpcc060] Abort: [chpcc060:1] Got completion with error 
>> IBV_WC_LOC_PROT_ERR, code=4 at line 2374 in file viacheck.c
>> [1] Got FATAL event IBV_EVENT_QP_LAST_WQE_REACHED, code=16 at line 2554 in 
>> file viacheck.c
>> [0:chpcc058] Abort: [0] Got FATAL event IBV_EVENT_QP_LAST_WQE_REACHED, code=16
>>  at line 2554 in file viacheck.c
>> /CHPC/usr/local/mvapich_099b/bin/mpirun: line 1: 17412 
>> Terminated              /CHPC/usr/local/mvapich_099b/bin/mpirun_rsh -np 
>> 2 -hostfile /CHPC/home/loadl/execute/chpcln.4269.0.machinefile /CHPC/home/ppenven/Roms_tools/TEST1/./roms
>>
>> The problem does not occur when using MPI over IP rather than VAPI.
>>
>> Is there a solution to this problem?
>>
>> Thanks a lot
>>
>> Pierrick
>>     
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>   


-- 
http://www.cse.ohio-state.edu/~surs


