[mvapich-discuss] CUDA-aware MVAPICH with persistent communicators
Kate Clark
mclark at nvidia.com
Sat Sep 23 12:42:33 EDT 2017
Hi,
I’ve been testing MVAPICH with CUDA-awareness enabled, both for peer-to-peer (P2P) exchange within a node and for GPU Direct RDMA exchange between nodes. What I am finding is that MVAPICH appears to be poorly optimized for persistent communication requests, i.e., MPI_Send_init / MPI_Recv_init / MPI_Start / MPI_Wait, when used with GPU buffers.
A couple of immediate issues worth mentioning:
- There appears to be no optimization for persistent requests. For example, my application reuses the same handles thousands of times, but every call to MPI_Start triggers a fresh query of the pointer location, i.e., cuPointerGetAttribute is called repeatedly. This adds noticeable latency for both CPU and GPU messages, around 0.7 us of API overhead per query. My application is bound by CPU-side CUDA API latency, so removing any unneeded API calls is highly desirable. For example, see the trace below, taken from profiling my application with nvprof; every call to cuPointerGetAttribute comes from MVAPICH.
  API calls:
  Time(%)      Time     Calls        Avg        Min        Max  Name
   72.60%  146.888s  2.35e+08      624ns      333ns   57.367ms  cudaEventQuery
   11.45%  23.1676s   2000002   11.583us   4.2710us   275.34ms  cudaMemcpy2DAsync
    4.95%  10.0224s   5000011   2.0040us      419ns   32.029ms  cudaEventRecord
    4.65%  9.40582s  14001154      671ns      211ns   28.149ms  cuPointerGetAttribute
    4.09%  8.27296s   4000004   2.0680us      464ns   28.110ms  cudaStreamWaitEvent
- Comparing persistent message handles against MPI_Isend / MPI_Irecv shows that persistent handles can actually be slower than regular ones. This is true both for peer-to-peer exchange and for GPU Direct RDMA. It suggests that, under the hood, persistent message handles are not simply falling back to MPI_Isend / MPI_Irecv, and that some additional overhead is being introduced. This is surprising, since in principle persistent message handles should give the lowest latency of any exchange method, because all setup overheads can be amortized over repeated use.
Are there any plans to better optimize persistent message handles in MVAPICH?
Thanks,
Kate.