[mvapich-discuss] [PATCH] Enable blocking infiniband with dynamic connections

Maksym Planeta mplaneta at os.inf.tu-dresden.de
Wed Dec 16 11:10:03 EST 2015


Hi,

there used to be a bug that deadlocked the whole application if dynamic 
connections were used with blocking infiniband.

The reason for this bug (as far as I could figure out) is the following.
When a message is sent over a virtual connection (VC), it is checked
whether a queue pair (QP) exists for this connection. If there is none,
a queue pair is created and the actual message transfer is postponed.

Consider the send-receive function MPIC_Sendrecv from
mpi/coll/helper_fns.c. It consists of the following operations:

1. MPIC_IRecv
2. MPIC_ISend
3. MPIC_Wait(send)
4. MPIC_Wait(recv)

It turns out that if there is no QP for the VC, MPIC_Wait does not send
the message. Instead, the sender thread puts the message on a pending
queue (vc->ch.cm_sendq_head). Connection creation is done by a
connection manager (CM) thread. Once it has created the connection, it
sets MPIDI_CH3I_Process.new_conn_complete to one, indicating that the
main thread can now send the pending messages over the channel.
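
To make the deferred-send path described above concrete, here is a
minimal sketch in C. It is a toy model and not MVAPICH2 code: struct
toy_vc, toy_send, toy_cm_connect and toy_new_conn_complete are
hypothetical names that only mimic the roles of the VC, the pending
queue and the new_conn_complete flag.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

/* Hypothetical stand-in for a virtual connection; not the MVAPICH2 type. */
struct toy_vc {
    bool has_qp;               /* true once a queue pair exists */
    int pending[MAX_PENDING];  /* messages parked until the QP is ready */
    size_t n_pending;
    size_t n_on_wire;          /* messages actually handed to the network */
};

/* Plays the role of the new_conn_complete flag (assumed global here). */
int toy_new_conn_complete = 0;

/* Send path: transmit if a QP exists, otherwise park the message.
 * Returns 0 if sent, 1 if queued, -1 if the toy queue is full. */
int toy_send(struct toy_vc *vc, int msg)
{
    if (vc->has_qp) {
        vc->n_on_wire++;
        return 0;
    }
    if (vc->n_pending == MAX_PENDING)
        return -1;
    vc->pending[vc->n_pending++] = msg;
    return 1;
}

/* What the connection manager does in the original design: it creates
 * the QP and raises the flag, but leaves the pending queue untouched. */
void toy_cm_connect(struct toy_vc *vc)
{
    vc->has_qp = true;
    toy_new_conn_complete = 1;
}
```

Note that after toy_cm_connect the parked message is still in the
queue. Somebody has to notice the flag and flush it.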

The main thread polls MPIDI_CH3I_Process.new_conn_complete regularly.
When it detects that the flag is set, it sends the pending messages over
the new channel and then waits until the message from the other process
arrives (step 4).

This works fine with polling. But in blocking mode the situation is
different. The main thread does not poll new_conn_complete; instead it
blocks in perform_blocking_progress and never checks the pending queue.
This ends in a deadlock: no process can ever receive a message, because
no other process ever sends one.
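
The difference between the two progress modes can be sketched as a
single-threaded toy model. All names here (struct toy_state,
toy_poll_step, toy_block_step) are hypothetical, and the "sleep" is
only simulated by a return value. This is an illustration of the
reasoning above, not the actual progress engine.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model state; names are illustrative, not MVAPICH2 symbols. */
struct toy_state {
    int new_conn_complete;  /* raised by the connection manager */
    size_t pending;         /* messages waiting for the connection */
    size_t sent;            /* messages handed to the network */
    size_t cq_events;       /* completion events available to the main thread */
};

/* One polling-mode progress step: the flag is checked on every pass,
 * so pending messages are flushed as soon as the connection is up. */
void toy_poll_step(struct toy_state *s)
{
    if (s->new_conn_complete) {
        s->sent += s->pending;
        s->pending = 0;
        s->new_conn_complete = 0;
    }
}

/* One blocking-mode progress step as described in this mail: the thread
 * waits only for completion events and never examines the flag.
 * Returns true if an event was consumed, false if a real thread would
 * go to sleep waiting for one. */
bool toy_block_step(struct toy_state *s)
{
    if (s->cq_events == 0)
        return false;
    s->cq_events--;
    return true;
}
```

With the flag set and one message pending, toy_poll_step sends the
message, while toy_block_step reports that the thread would sleep and
leaves the message queued. If every process is in that state, no
completion event ever arrives.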

I propose a fix which flushes the pending queue from the CM thread as
soon as the connection is established. This eliminates the race condition.
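
The idea of the fix can be sketched as follows. This is a simplified
model and not the attached patch: struct toy_vc, toy_vc_init, toy_send
and toy_cm_on_connected are hypothetical names, and the mutex is an
assumption of this sketch to keep the two threads from touching the
queue at the same time.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8

/* Hypothetical VC model; the lock is an assumption of this sketch. */
struct toy_vc {
    pthread_mutex_t lock;
    bool connected;
    int pending[MAX_PENDING];
    size_t n_pending;
    size_t n_sent;
};

void toy_vc_init(struct toy_vc *vc)
{
    pthread_mutex_init(&vc->lock, NULL);
    vc->connected = false;
    vc->n_pending = 0;
    vc->n_sent = 0;
}

/* Main-thread send path.
 * Returns 0 if sent, 1 if queued, -1 if the toy queue is full. */
int toy_send(struct toy_vc *vc, int msg)
{
    int rc;
    pthread_mutex_lock(&vc->lock);
    if (vc->connected) {
        vc->n_sent++;
        rc = 0;
    } else if (vc->n_pending < MAX_PENDING) {
        vc->pending[vc->n_pending++] = msg;
        rc = 1;
    } else {
        rc = -1;
    }
    pthread_mutex_unlock(&vc->lock);
    return rc;
}

/* CM-thread side: mark the VC connected and flush the pending queue in
 * the same critical section, so no message is left behind for a main
 * thread that may already be asleep. */
void toy_cm_on_connected(struct toy_vc *vc)
{
    pthread_mutex_lock(&vc->lock);
    vc->connected = true;
    vc->n_sent += vc->n_pending;
    vc->n_pending = 0;
    pthread_mutex_unlock(&vc->lock);
}
```

Because the flush happens on the CM thread, it no longer matters
whether the main thread is polling or blocked when the connection
comes up.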

Attached is a patch which fixes this race condition, but does not yet
enable blocking with dynamic connections. In a follow-up mail I
describe the second bug fix, which enables blocking infiniband with
dynamic connections.

I'd like to hear from you about this fix. I have to warn you that I
tested it only with IB and UD, not with iWARP or XRC.

-- 
Regards,
Maksym Planeta
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Handle-pending-message-immediately-after-VC-becomes-.patch
Type: text/x-patch
Size: 13262 bytes
Desc: not available
URL: <http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20151216/a42b2d9d/attachment-0001.bin>