[mvapich-discuss] FATAL event IBV_EVENT_QP_LAST_WQE_REACHED
Sayantan Sur
surs at cse.ohio-state.edu
Sat Aug 4 15:27:21 EDT 2007
Hi Nathan,
Nathan Dauchy wrote:
> Pierrick, All,
>
> We recently upgraded to OFED-1.2 (mvapich-0.9.9) and are now getting an
> error that looks similar to yours:
>
> [0:w72] Abort: [0] Got FATAL event IBV_EVENT_QP_LAST_WQE_REACHED,
> code=16 at line 2552 in file viacheck.c
>
> Did you ever find a solution? (I can't find one in the archives.)
>
> Can someone explain what the IBV_EVENT_QP_LAST_WQE_REACHED error means?
> I can't find any clues in the source, and have been unable to turn up
> any relevant docs either.
>
> Thanks for any help and clues you can offer!
>
Thanks for reporting the problem. The event
IBV_EVENT_QP_LAST_WQE_REACHED means that the QP (queue pair, the
internal InfiniBand communication channel) has entered the error state
and all outstanding work requests on it have been consumed. Could this
be related to a setup issue? Can you run other MPI programs, such as
the OSU benchmarks or IMB, across all of these nodes?
Thanks,
Sayantan.
> Regards,
> Nathan
>
>
> Pierrick Penven, Tue Mar 27 12:33:03 EDT 2007:
>
>> Dear all,
>>
>> I am trying to install an ocean model on a cluster of 64-bit nodes, each
>> with two dual-core AMD Opterons, connected by InfiniBand, using pathf90
>> and mvapich v0.9.8.
>>
>> The model runs and scales well on 1 node, but it is not able to run across
>> several nodes. I have also tried mvapich v0.9.9.beta, and I get the
>> following message:
>>
>> [1:chpcc060] Abort: [1:chpcc060] Abort: [chpcc060:1] Got completion with error
>> IBV_WC_LOC_PROT_ERR, code=4 at line 2374 in file viacheck.c
>> [1] Got FATAL event IBV_EVENT_QP_LAST_WQE_REACHED, code=16 at line 2554 in
>> file viacheck.c
>> [0:chpcc058] Abort: [0] Got FATAL event IBV_EVENT_QP_LAST_WQE_REACHED, code=16
>> at line 2554 in file viacheck.c
>> /CHPC/usr/local/mvapich_099b/bin/mpirun: line 1: 17412
>> Terminated /CHPC/usr/local/mvapich_099b/bin/mpirun_rsh -np
>> 2 -hostfile /CHPC/home/loadl/execute/chpcln.4269.0.machinefile /CHPC/home/ppenven/Roms_tools/TEST1/./roms
>>
>> The problem does not occur when using MPI over IP rather than VAPI.
>>
>> Is there a solution to this problem?
>>
>> Thanks a lot
>>
>> Pierrick
>>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
--
http://www.cse.ohio-state.edu/~surs