[mvapich-discuss] MPI_{Send, Recv} CUDA buffer not actually synchronous?

Akshay Venkatesh akshay.v.3.14 at gmail.com
Sat Nov 15 14:39:16 EST 2014


Hi Steven,

Would it be possible to share a reproducer so that we can check whether there's a
bug locally? A simple code snippet would suffice.
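
For concreteness, here is a rough sketch of the kind of reproducer that would help (the payload size, datatype, and verification loop are only placeholders, not taken from your report): rank 0 sends from a plain host buffer and rank 1 receives directly into a cudaMalloc'd device buffer, then copies the data back to the host to check it.

/* Hypothetical minimal reproducer sketch: host-buffer send on rank 0,
 * device-buffer (GPU-direct) receive on rank 1, followed by a check. */
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

#define N (1 << 20)   /* placeholder message size */

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Master: host buffer filled with a known pattern. */
        int *hbuf = malloc(N * sizeof(int));
        for (int i = 0; i < N; i++) hbuf[i] = i;
        MPI_Send(hbuf, N, MPI_INT, 1, 0, MPI_COMM_WORLD);
        free(hbuf);
    } else if (rank == 1) {
        /* Slave: receive straight into a CUDA device buffer. */
        int *dbuf;
        cudaMalloc((void **)&dbuf, N * sizeof(int));
        MPI_Recv(dbuf, N, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        /* MPI_Recv has returned, so the data should already be in dbuf.
         * Copy it back and compare; a mismatch here would indicate the
         * receive was reported complete before the device buffer was
         * actually populated. */
        int *check = malloc(N * sizeof(int));
        cudaMemcpy(check, dbuf, N * sizeof(int), cudaMemcpyDeviceToHost);
        for (int i = 0; i < N; i++) {
            if (check[i] != i) {
                printf("mismatch at %d: got %d\n", i, check[i]);
                break;
            }
        }
        free(check);
        cudaFree(dbuf);
    }

    MPI_Finalize();
    return 0;
}

Compiled with something like mpicc repro.c -lcudart and run with two ranks (and, for the GDR build, MV2_USE_CUDA=1 set), the check at the end should flag any case where MPI_Recv returns before the device buffer actually holds the data.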

Thanks
On Nov 14, 2014 11:08 PM, "Steven Eliuk" <s.eliuk at samsung.com> wrote:

>  Hi all,
>
>  We have noticed some strange behavior with an MPI_{Send, Recv} pair where the
> master sends data located in a host buffer to a slave’s GPU-direct buffer.
> Initially we believed it occurred only in a distributed, multi-node setting,
> but we have since narrowed it down to a very simple case where everything
> resides on one node, e.g. a master with two slaves.
>
>  Do you have a more detailed change log from 2.0b-gdr -> 2.0? Version 2.0
> seems to fix the most basic test in which we can reproduce this, but we have
> more complicated tests that still show the same behavior. We are hoping to
> track it down; it seems as though the sync recv is being reported as complete
> a little early… when in fact it hasn’t completed.
>
>  Kindest Regards,
>
> Steven Eliuk, Ph.D. Comp Sci,
> Advanced Software Platforms Lab,
> SRA - SV,
> Samsung Electronics,
> 1732 North First Street,
> San Jose, CA 95112,
> Work: +1 408-652-1976,
> Work: +1 408-544-5781 Wednesdays,
> Cell: +1 408-819-4407.

