[Mvapich-discuss] mlx5 IBV_WC_LOC_QP_OP_ERR

Lana Deere lana.deere at gmail.com
Fri Jan 22 12:02:20 EST 2021


I've been running the same program with the same dataset repeatedly in
order to reproduce a different issue.  One of my runs failed with the
message included below.  I'm using mvapich2 2.3.5-1.  The vendor code
0x68 it references is "malformed WQE (Work Queue Element)".  Does anyone
have any ideas on the cause of this?  I'm not sure how repeatable this
will turn out to be.

mlx5: worker15.local: got completion with error:
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000006 00000000 00000000 00000000
00000000 12006802 000022c5 06746ed2
[worker15.local:mpi_rank_7][handle_cqe] Send desc error in msg to 7, wc_opcode=0
[worker15.local:mpi_rank_7][handle_cqe] Msg from 7: wc.status=2 (local QP operation error), wc.wr_id=0xe4eaa50, wc.opcode=0, vbuf->phead->type=2 = MPIDI_CH3_PKT_FAST_EAGER_SEND
[worker15.local:mpi_rank_7][mv2_print_wc_status_error] IBV_WC_LOC_QP_OP_ERR: This event is generated when a QP error occurs. For example, it may be generated if a) user neglects to specify responder_resources and initiator_depth values in struct rdma_conn_param before calling rdma_connect() on the client side and rdma_accept() on the server side, b) a Work Request that was posted in a local Send Queue of a UD QP contains an Address Handle that is associated with a Protection Domain to a QP which is associated with a different Protection Domain, or c) an opcode which is not supported by the transport type of the QP is not supported (for example: RDMA Write over a UD QP).
[worker15.local:mpi_rank_7][handle_cqe] src/mpid/ch3/channels/mrail/src/gen2/ibv_channel_manager.c:499: [] Got completion with error 2, vendor code=0x68, dest rank=7
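
For reference, here is a rough sketch of where the status 2 and the
vendor code 0x68 sit in the hex dump above.  It assumes the dump is the
64-byte error CQE printed as sixteen big-endian 32-bit words, laid out
like struct mlx5_err_cqe in rdma-core's mlx5dv.h (vendor_err_synd at
byte 54, syndrome at byte 55, then the QP number and WQE counter).  The
layout is an assumption on my part and is not stated anywhere in the
log; decode_err_cqe and its field names are only illustrative.

```python
# Sketch only: the byte offsets below assume the struct mlx5_err_cqe
# layout from rdma-core's mlx5dv.h; they are not confirmed by the log.

# Hex dump copied from the "got completion with error" message.
DUMP = """
00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000
00000006 00000000 00000000 00000000
00000000 12006802 000022c5 06746ed2
"""


def decode_err_cqe(dump):
    """Pull the error fields out of a 64-byte CQE hex dump (hypothetical helper)."""
    # Concatenate the printed words; each word is shown in big-endian order.
    raw = bytes.fromhex("".join(dump.split()))
    if len(raw) != 64:
        raise ValueError("expected 64 bytes, got %d" % len(raw))

    # Byte 56..59 holds the send WQE opcode (top byte) and QP number (low 24 bits).
    s_wqe_opcode_qpn = int.from_bytes(raw[56:60], "big")
    return {
        "vendor_err_synd": raw[54],   # reported as "vendor code" in the log
        "syndrome": raw[55],          # reported as "error 2" / wc.status=2
        "qpn": s_wqe_opcode_qpn & 0xFFFFFF,
        "wqe_counter": int.from_bytes(raw[60:62], "big"),
    }


if __name__ == "__main__":
    for name, value in decode_err_cqe(DUMP).items():
        print("%s = 0x%x" % (name, value))
```

Under that assumed layout, the two bytes 68 and 02 in the word 12006802
line up with the "vendor code=0x68" and "error 2" values that mvapich2
prints in the last log line.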

.. Lana (lana.deere at gmail.com)