[mvapich-discuss] BUG REPORT: MVAPICH2 over OFED 1.5.4.1 fails in heterogeneous fabrics

Heinz, Michael William michael.william.heinz at intel.com
Mon Apr 2 14:13:17 EDT 2012


Basically, the problem is this: in version 1.7 of mvapich2, handling of a mixed fabric was set up before the IB queue pairs were initialized. This was done by calling rdma_ring_based_allgather() to collect information about the HCA types and then calling rdma_param_handle_heterogenity() (see lines 250-270 of rdma_iba_init.c).

Working this way permitted each rank to correctly determine whether to create a shared receive queue or not.

Unfortunately, this ordering was eliminated in 1.7-r5140. In the new version, rdma_param_handle_heterogenity() is not called until *after* the shared receive queue has already been created and the QP has been moved to the ready-to-receive state. When rdma_param_handle_heterogenity() then turns the shared receive queue off, the queue pairs are left in an unusable state.
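To make the ordering issue concrete, here is a minimal toy model in C of the two initialization sequences described above. It is only a sketch and is not MVAPICH2 code: every name in it (toy_qp, toy_rank, fabric_is_mixed, create_qp, and so on) is hypothetical, and the rule "turn SRQ off when the gathered HCA types are not all the same" is an assumption standing in for whatever rdma_param_handle_heterogenity() actually checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queue pair; not a real verbs or MVAPICH2 type. */
typedef struct {
    bool created;
    bool attached_to_srq;   /* fixed at the moment the QP is created */
} toy_qp;

/* Hypothetical per-rank state; use_srq models the effect of MV2_USE_SRQ. */
typedef struct {
    bool use_srq;
    toy_qp qp;
} toy_rank;

/* Assumed rule: the fabric counts as mixed if the gathered HCA types differ. */
bool fabric_is_mixed(const int *hca_types, size_t n) {
    for (size_t i = 1; i < n; i++)
        if (hca_types[i] != hca_types[0])
            return true;
    return false;
}

/* Creating the QP bakes in whatever the SRQ setting is at that moment. */
void create_qp(toy_rank *r) {
    assert(!r->qp.created);
    r->qp.created = true;
    r->qp.attached_to_srq = r->use_srq;
}

/* In this model a QP is usable only if it still matches the final setting. */
bool qp_usable(const toy_rank *r) {
    return r->qp.created && r->qp.attached_to_srq == r->use_srq;
}

/* Ordering described for 1.7: gather HCA types, decide on SRQ, then create. */
bool init_decide_then_create(const int *hca_types, size_t n) {
    toy_rank r = { .use_srq = true };
    if (fabric_is_mixed(hca_types, n))
        r.use_srq = false;
    create_qp(&r);
    return qp_usable(&r);
}

/* Ordering described for 1.7-r5140: create first, decide on SRQ afterwards. */
bool init_create_then_decide(const int *hca_types, size_t n) {
    toy_rank r = { .use_srq = true };
    create_qp(&r);
    if (fabric_is_mixed(hca_types, n))
        r.use_srq = false;   /* too late: the QP was created with SRQ on */
    return qp_usable(&r);
}
```

In this model, both orderings give a usable QP when all HCA types match, but with mixed types only init_decide_then_create() does. Starting with use_srq already false, which is what the MV2_USE_SRQ=0 setting corresponds to here, would make the late decision a no-op.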

This problem affects fabrics using HCAs from IBM, older Tavor-style Mellanox HCAs and QLogic HCAs.

We've reviewed the changes and, unfortunately, we can't see a way to fix this without going back to using rdma_ring_based_allgather() to collect information about the HCA types before initializing the queue pairs. The workaround is to manually set MV2_USE_SRQ=0 when using mvapich2-1.7-r5140.


