[mvapich-discuss] [PATCH] Enable blocking infiniband with dynamic connections

Maksym Planeta mplaneta at os.inf.tu-dresden.de
Fri Dec 18 12:04:54 EST 2015


Hello Hari,

If you could share a case where the deadlock still occurs, I could
probably help. I have the feeling that I understand the code pretty well
already :)

Could you also explain what the correctness issues are?

On 12/18/2015 05:22 PM, Hari Subramoni wrote:
> Hello Maksym,
>
> Thanks for the investigation and the patch. We appreciate it. We are
> also aware of the issue and have been trying to fix it :-).
>
> We went over the patch. Unfortunately, it does not seem to fix all
> possible hangs. Further, the patch can potentially lead to correctness
> issues when MVAPICH2 is running in a multi-threaded scenario.
>
> Regards,
> Hari.
>
> On Wed, Dec 16, 2015 at 11:10 AM, Maksym Planeta
> <mplaneta at os.inf.tu-dresden.de <mailto:mplaneta at os.inf.tu-dresden.de>>
> wrote:
>
>     Hi,
>
>     there used to be a bug that deadlocked the whole application if
>     dynamic connections were used with blocking infiniband.
>
>     The reason for this bug, as far as I could figure out, is the
>     following. When a message is sent over a virtual connection (VC),
>     the code checks whether a queue pair (QP) exists for this
>     connection. If there is none, a queue pair is created and the
>     actual message transfer is postponed.
>
>     Consider, as an example, the send-receive function MPIC_Sendrecv
>     from mpi/coll/helper_fns.c. It consists of the following operations:
>
>     1. MPIC_IRecv
>     2. MPIC_ISend
>     3. MPIC_Wait(send)
>     4. MPIC_Wait(recv)
>
>     It turns out that if there is no QP for a VC, MPIC_Wait does not
>     send a message. Instead, the sender thread puts the message into a
>     pending queue (vc->ch.cm_sendq_head). Connection creation is done
>     by a connection manager (CM) thread. Once it has created a
>     connection, it sets MPIDI_CH3I_Process.new_conn_complete to one,
>     indicating that the main thread can now send the pending messages
>     over the channel.
>
>     The main thread polls MPIDI_CH3I_Process.new_conn_complete
>     regularly. When the main thread detects that the flag is set, it
>     sends the pending messages over the new channel and waits until the
>     message from the other process arrives (step 4).
>
>     This works fine with polling. In blocking mode the situation is
>     different. The main thread does not poll the variable
>     new_conn_complete. Instead it blocks in perform_blocking_progress
>     and never checks the pending queue. This ends in a deadlock: no
>     process can ever receive a message, because no other process ever
>     sends one.
>
>     I propose a fix that flushes the pending queue directly from the CM
>     thread as soon as the connection is established. This eliminates
>     the race condition.
>
>     Attached is a patch that fixes this race condition but does not yet
>     enable blocking mode with dynamic connections. In a follow-up mail
>     I describe the second bug fix, which enables blocking infiniband
>     with dynamic connections.
>
>     I'd like to hear from you about this fix. I have to warn you that
>     I tested it only with IB and UD, not with iWARP or XRC.
>
>     --
>     Regards,
>     Maksym Planeta
>
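
To make the mechanism from the quoted mail concrete, here is a minimal
single-threaded sketch of it in C. It is only a model under stated
assumptions: all toy_* names are hypothetical and are not MVAPICH2
identifiers, the flag is kept per connection for brevity, and the real
code runs the CM in a separate thread with locking, which this model
leaves out entirely (so it says nothing about the multi-threaded
correctness issues).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PENDING 8

/* Hypothetical, simplified stand-in for a virtual connection (VC). */
struct toy_vc {
    bool   qp_ready;                  /* has the queue pair been created? */
    bool   new_conn_complete;         /* flag raised by the CM when done */
    int    pending[TOY_MAX_PENDING];  /* messages postponed until a QP exists */
    size_t n_pending;
    size_t n_sent;                    /* messages actually handed to the wire */
};

/* Send path: transmit if a QP exists, otherwise park the message. */
void toy_send(struct toy_vc *vc, int msg)
{
    if (vc->qp_ready) {
        vc->n_sent++;
        return;
    }
    assert(vc->n_pending < TOY_MAX_PENDING);
    vc->pending[vc->n_pending++] = msg;
}

/* Push every parked message out over the now-established connection. */
void toy_flush_pending(struct toy_vc *vc)
{
    vc->n_sent += vc->n_pending;
    vc->n_pending = 0;
}

/* What the CM does once the connection is up.
 * flush_here == false: only raise the flag (behaviour described in the mail).
 * flush_here == true:  flush right away (the idea behind the proposed fix). */
void toy_cm_connection_established(struct toy_vc *vc, bool flush_here)
{
    vc->qp_ready = true;
    if (flush_here)
        toy_flush_pending(vc);
    else
        vc->new_conn_complete = true;
}

/* One progress step of the main thread.
 * Polling mode checks the flag. Blocking mode is modelled as returning
 * without checking it, standing in for a thread that sleeps on a
 * completion event that never arrives. */
void toy_progress(struct toy_vc *vc, bool blocking)
{
    if (blocking)
        return;
    if (vc->new_conn_complete) {
        vc->new_conn_complete = false;
        toy_flush_pending(vc);
    }
}
```

In this model, calling toy_cm_connection_established(vc, false) followed
by toy_progress(vc, true) leaves the message in the pending queue, which
corresponds to the deadlock. With flush_here set to true the message is
sent regardless of the progress mode.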

-- 
Regards,
Maksym Planeta
