[mvapich-discuss] [PATCH] Enable blocking infiniband with dynamic connections

Maksym Planeta mplaneta at os.inf.tu-dresden.de
Wed Dec 16 11:21:09 EST 2015


Hi,

this is the second patch to fix dynamic connections with blocking IB.

It solves the problem that ibv_get_cq_event does not return in
perform_blocking_progress_for_ib. The problem, as I see it, is the
following. Once the main thread returns from ibv_get_cq_event for the
first time, there is a time window between ibv_get_cq_event and
ibv_req_notify_cq. If a message arrives in this window, IB never sends
a notification for it, so ibv_get_cq_event does not return again. I
moved ibv_req_notify_cq earlier. Now, if there is an event pending, it
is caught by ibv_poll_cq. If not, then this event did not exist before
ibv_poll_cq returned, so ibv_req_notify_cq is guaranteed to have
requested the very next event.
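
The ordering argument above can be illustrated with a toy model. This is
only a sketch: it does not call libibverbs, and every name in it (toy_cq,
toy_req_notify, toy_arrive, toy_poll_all, would_hang, poll_then_arm,
arm_then_poll) is hypothetical. The only thing it assumes is the arming
rule of ibv_req_notify_cq: an event is raised for a completion added
after the request, not for entries that were already in the CQ.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy completion queue. NOT the libibverbs API; all names are hypothetical. */
struct toy_cq {
    int  pending;  /* completions queued but not yet polled */
    bool armed;    /* true after a notification request, until it fires */
    int  events;   /* notifications delivered to the completion channel */
};

/* Stands in for ibv_req_notify_cq: request an event for the NEXT completion. */
static void toy_req_notify(struct toy_cq *cq) { cq->armed = true; }

/* A message completes: it raises an event only if the CQ is armed right now. */
static void toy_arrive(struct toy_cq *cq)
{
    cq->pending++;
    if (cq->armed) {
        cq->events++;
        cq->armed = false;  /* one request yields one event */
    }
}

/* Stands in for a loop over ibv_poll_cq that drains the queue. */
static int toy_poll_all(struct toy_cq *cq)
{
    int n = cq->pending;
    assert(n >= 0);
    cq->pending = 0;
    return n;
}

/* A blocking wait would hang if work is queued but no event was raised. */
static bool would_hang(const struct toy_cq *cq)
{
    return cq->pending > 0 && cq->events == 0;
}

/* Poll first, arm afterwards; the message lands in between. */
static bool poll_then_arm(void)
{
    struct toy_cq cq = { 0, false, 0 };
    toy_poll_all(&cq);    /* finds nothing */
    toy_arrive(&cq);      /* unarmed, so no event is raised */
    toy_req_notify(&cq);  /* too late for the queued completion */
    return would_hang(&cq);
}

enum arrival { BEFORE_ARM, BETWEEN_ARM_AND_POLL, AFTER_POLL };

/* Arm first, poll afterwards; the message may land at any point. */
static bool arm_then_poll(enum arrival when)
{
    struct toy_cq cq = { 0, false, 0 };
    if (when == BEFORE_ARM) toy_arrive(&cq);
    toy_req_notify(&cq);
    if (when == BETWEEN_ARM_AND_POLL) toy_arrive(&cq);
    toy_poll_all(&cq);
    if (when == AFTER_POLL) toy_arrive(&cq);
    return would_hang(&cq);
}
```

In this model poll_then_arm() reports a hang (a queued completion with no
event), while arm_then_poll() does not for any of the three arrival
points: the message is either drained by the poll or raises an event.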

Again, I have to warn you that I tested this only with IB and UD, not
with iWARP or XRC.

I used this command for testing:

MV2_USE_SHARED_MEM=1 MV2_RDMA_NUM_EXTRA_POLLS=1 MV2_SPIN_COUNT=1 \
MV2_ON_DEMAND_THRESHOLD=1 MV2_USE_BLOCKING=1 MV2_ENABLE_AFFINITY=0 \
mpirun -n 25 -prepend-rank -hostfile host ./hello_world

I disable affinity, so that all the processes run on the same core. I 
tried up to 500 processes.

This program finishes much faster with dynamic connections than without.

I would be glad to hear your opinion on whether this is an appropriate fix.

-- 
Regards,
Maksym Planeta
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Fix-race-condition-in-blocking-progress-for-ib.patch
Type: text/x-patch
Size: 4924 bytes
Desc: not available
URL: <http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20151216/faffb399/attachment.bin>