[mvapich-discuss] FATAL event IBV_EVENT_QP_LAST_WQE_REACHED
Nathan Dauchy
Nathan.Dauchy at noaa.gov
Tue Aug 21 15:40:45 EDT 2007
Updated...
Nathan Dauchy wrote:
> I finally had time to get back to this issue...
>
> The OSU benchmarks run fine.
> The presta benchmarks run fine.
> I'm getting a segfault with IMB. Running it under gdb, I grabbed the
> following backtrace:
>
> (gdb) bt
> #0 0x000000000044a846 in movdqa8 ()
> #1 0x00000000004490b6 in _intel_fast_memcpy.J ()
> #2 0x000000000043c03f in smpi_recv_get ()
> #3 0x000000000043a03b in smpi_net_lookup ()
> #4 0x0000000000439d61 in MPID_SMP_Check_incoming ()
> #5 0x000000000042aff8 in viutil_spinandwaitcq ()
> #6 0x000000000042a426 in MPID_DeviceCheck ()
> #7 0x000000000043334f in MPID_RecvComplete ()
> #8 0x00000000004369c5 in MPID_RecvDatatype ()
> #9 0x000000000040eefe in PMPI_Recv ()
> #10 0x0000000000407880 in IMB_pingpong (c_info=0x60d790, size=-1765560296,
> n_sample=-1765558112, RUN_MODE=0x60e000, time=0x1770)
> at IMB_pingpong.c:180
> #11 0x000000000040636e in IMB_warm_up (c_info=0x60d790, Bmark=0x2a96c3b018,
> iter=-1765558112) at IMB_warm_up.c:127
> #12 0x000000000040393f in main (argc=1, argv=0x7fbfffe508) at IMB.c:262
>
> It doesn't look to me like it is actually related to the
> IBV_EVENT_QP_LAST_WQE_REACHED error, but I'm sure others on this list
> can tell better than I can. Does the IMB segfault point to anything in
> particular?
IMB now runs fine. The segfault came from my build: I had the wrong library path when compiling.
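For anyone hitting a similar segfault, here is a rough sketch of how the link setup could be checked before rebuilding. It assumes an MPICH-derived compiler wrapper (such as MVAPICH's mpicc) that accepts -show. The helper names show_mpi_link_line and link_dirs are made up for illustration and are not part of any MPI distribution.

```shell
#!/bin/sh
# Sketch only: both function names below are hypothetical helpers.

# Print which compiler wrapper is first on PATH and the command it would run.
# Assumes an MPICH-derived wrapper that supports -show; others may differ.
show_mpi_link_line() {
    wrapper=${1:-mpicc}
    if ! command -v "$wrapper" >/dev/null 2>&1; then
        echo "$wrapper: not found on PATH"
        return 1
    fi
    echo "wrapper: $(command -v "$wrapper")"
    # -show prints the underlying compile/link command without executing it.
    "$wrapper" -show
}

# Given a link line as one string, print its -L directories,
# one per line, in the order the linker would search them.
link_dirs() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^-L//p'
}
```

Running link_dirs "$(mpicc -show)" lists the library directories in search order, which makes it easier to spot a stale or mismatched MPI or verbs install ahead of the intended one.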
>> Sayantan Sur wrote:
>>> Thanks for reporting the problem. The event
>>> IBV_EVENT_QP_LAST_WQE_REACHED means that the QP (internal InfiniBand
>>> communication channel) is in an error state and all requests are
>>> consumed. Could it be related to a setup issue? Can you run any other
>>> MPI programs such as OSU benchmarks, IMB etc. on all these nodes?
>>>
Any other ideas about the IBV_EVENT_QP_LAST_WQE_REACHED error?
Thanks,
Nathan