[mvapich-discuss] MVAPICH 2 Progress Code improvement for RDMA_FAST_PATH

wei huang huanwei at cse.ohio-state.edu
Sun Mar 25 14:39:08 EDT 2007


Hi Sylvain,

Thanks for your effort in helping us improve the RDMA fast path code. Your
proposal looks good to us. There may be some corner cases in the progress
engine that need to be considered, but we should be able to take care of
them later. We are actually working on an upcoming 1.0 release, which will
have more features, including an enhanced messaging rate, enhanced
collectives, etc. Now should be the right time to incorporate such an
enhancement; we will then have time to systematically test and evaluate
the changes. A patch from you will definitely help us move faster in
this direction. A patch against 0.9.8 should work fine.

Thanks again, and we look forward to discussing this with you further.

-- Wei

> [ADAPTIVE_]RDMA_FAST_PATH is an optimization to provide low latency in
> mvapich2. The issue is that latency increases as the total number of
> processes grows. Eventually, when you launch a job with more than 32
> processes, latency is worse than with the standard send/recv protocol.
>
> The reason for that is very simple. Unlike the send/recv protocol,
> which gets its receives from a single completion queue, the RDMA fast
> path has to poll _every_ RDMA queue to find out which queue has data
> to receive.
>
> My first attempt at improving this was to poll only the VCs associated
> with the requests passed to MPID_Progress. That didn't work well
> because, unfortunately, well-written MPI applications are scarce, and
> calling MPI_Wait on the wrong request resulted in a deadlock.
>
> My second attempt works a lot better. The RDMA polling set is now restricted to:
>   * VCs on which we have pending posted receives;
>   * VCs on which we have a rendezvous send in progress.
> .. and it seems to work correctly and quickly, since polling is almost
> always directed at the right VC.
>
> Does anyone already have a good (better) solution for this? Am I totally
> mistaken in my understanding of the MVAPICH2 code? If I'm not, I will
> consider cleaning things up and proposing a patch against 0.9.8, unless I
> should wait for 0.9.9?
>
> Thanks in advance for your opinions/comments/flames on that,
>
> Sylvain
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
