[mvapich-discuss] MPI_{Send, Recv} CUDA buffer not actually synchronous?

Steven Eliuk s.eliuk at samsung.com
Fri Nov 14 16:30:15 EST 2014


Hi all,

We have noticed some strange behavior with an MPI_{Send, Recv} pair where the master sends data located in a host buffer to a slave's GPU-direct buffer. Initially we believed it occurred only in a distributed, multi-node setup, but we have since narrowed it down to a very simple case where everything resides on one node: a master with two slaves.
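For reference, below is a minimal sketch of the shape of test we are describing (buffer size, tag, and data pattern are illustrative, not our actual test code), assuming a CUDA-aware MVAPICH2-GDR build with MV2_USE_CUDA=1 and at least two ranks: rank 0 sends from a host buffer, rank 1 receives directly into a cudaMalloc'd device buffer and then copies it back to the host to validate what is actually in device memory once MPI_Recv has returned.

/* Illustrative sketch only -- not our actual test code. */
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

#define N (1 << 20)

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Master: send a host buffer holding a known pattern. */
        int *hbuf = malloc(N * sizeof(int));
        for (int i = 0; i < N; i++) hbuf[i] = i;
        MPI_Send(hbuf, N, MPI_INT, 1, 0, MPI_COMM_WORLD);
        free(hbuf);
    } else if (rank == 1) {
        /* Slave: receive directly into a device buffer. */
        int *dbuf;
        cudaMalloc((void **)&dbuf, N * sizeof(int));
        MPI_Recv(dbuf, N, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        /* Once the blocking MPI_Recv returns, the data should already be
         * resident in dbuf; copy it back to the host and verify. */
        int *check = malloc(N * sizeof(int));
        cudaMemcpy(check, dbuf, N * sizeof(int), cudaMemcpyDeviceToHost);
        for (int i = 0; i < N; i++) {
            if (check[i] != i) {
                fprintf(stderr, "mismatch at %d: got %d\n", i, check[i]);
                break;
            }
        }
        free(check);
        cudaFree(dbuf);
    }

    MPI_Finalize();
    return 0;
}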

Do you have a more detailed change log from 2.0b-GDR -> 2.0? Version 2.0 seems to fix the most basic test in which we can reproduce this, but we have more complicated tests that still show the same behavior. We are hoping to track it down; it seems as though you are posting completion of the synchronous recv a little earlier than it has actually completed… when in fact it hasn't.

Kindest Regards,
—
Steven Eliuk, Ph.D. Comp Sci,
Advanced Software Platforms Lab,
SRA - SV,
Samsung Electronics,
1732 North First Street,
San Jose, CA 95112,
Work: +1 408-652-1976,
Work: +1 408-544-5781 Wednesdays,
Cell: +1 408-819-4407.
