[mvapich-discuss] FATAL event IBV_EVENT_QP_LAST_WQE_REACHED

Nathan Dauchy Nathan.Dauchy at noaa.gov
Tue Aug 21 15:40:45 EDT 2007


Updated...

Nathan Dauchy wrote:
> I finally had time to get back to this issue...
> 
> The OSU benchmarks run fine.
> The presta benchmarks run fine.
> I'm getting a segfault with IMB.  Running it under gdb, I grabbed the
> following backtrace:
> 
> (gdb) bt
> #0  0x000000000044a846 in movdqa8 ()
> #1  0x00000000004490b6 in _intel_fast_memcpy.J ()
> #2  0x000000000043c03f in smpi_recv_get ()
> #3  0x000000000043a03b in smpi_net_lookup ()
> #4  0x0000000000439d61 in MPID_SMP_Check_incoming ()
> #5  0x000000000042aff8 in viutil_spinandwaitcq ()
> #6  0x000000000042a426 in MPID_DeviceCheck ()
> #7  0x000000000043334f in MPID_RecvComplete ()
> #8  0x00000000004369c5 in MPID_RecvDatatype ()
> #9  0x000000000040eefe in PMPI_Recv ()
> #10 0x0000000000407880 in IMB_pingpong (c_info=0x60d790, size=-1765560296,
>     n_sample=-1765558112, RUN_MODE=0x60e000, time=0x1770)
>     at IMB_pingpong.c:180
> #11 0x000000000040636e in IMB_warm_up (c_info=0x60d790, Bmark=0x2a96c3b018,
>     iter=-1765558112) at IMB_warm_up.c:127
> #12 0x000000000040393f in main (argc=1, argv=0x7fbfffe508) at IMB.c:262
> 
> It doesn't look to me like it is actually related to the
> IBV_EVENT_QP_LAST_WQE_REACHED error, but I'm sure others on this list
> can tell better than I can.  Does the IMB segfault point to anything in
> particular?

IMB now runs fine.  I had the library path wrong when compiling.

>> Sayantan Sur wrote:
>>> Thanks for reporting the problem. The event
>>> IBV_EVENT_QP_LAST_WQE_REACHED means that the QP (internal InfiniBand
>>> communication channel) is in an error state and all requests are
>>> consumed. Could it be related to a setup issue? Can you run any other
>>> MPI programs such as OSU benchmarks, IMB etc. on all these nodes?
>>>

Any other ideas about what could be causing the IBV_EVENT_QP_LAST_WQE_REACHED event?

Thanks,
Nathan
