[mvapich-discuss] issue with isend/recv to/from self with GPU memory

Benson, Tom benson31 at llnl.gov
Thu Jun 21 16:41:54 EDT 2018


Hi,

I have encountered an issue with receiving from self when the send buffer or the recv buffer resides on a GPU. I realize this is a rather pathological degenerate case, but it is supported in the CPU-only case.

Essentially, MPIDI_CH3U_RecvFromSelf() does not handle the case where the pointers refer to device memory, so the pointers land in a call to memcpy (good ol’ libc memcpy), which, not unexpectedly, fails fairly catastrophically.
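
To make the failure mode concrete, here is a minimal C++ sketch of the distinction the self-copy path would need to draw. It is not MVAPICH2 code: MemSpace, CopyKind, select_copy_kind and self_copy are hypothetical names, and the buffer location is passed in by the caller as a stand-in for what a CUDA build would query with cudaPointerGetAttributes(). The point is only that the host-to-host case is the one that is safe to hand to memcpy.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>

// Hypothetical tag for where a buffer lives. A CUDA build would derive this
// from cudaPointerGetAttributes() instead of taking it from the caller.
enum class MemSpace { Host, Device };

// Local enum naming the four directions a copy can take.
enum class CopyKind { HostToHost, HostToDevice, DeviceToHost, DeviceToDevice };

// Pick the copy direction from the locations of the source and destination.
CopyKind select_copy_kind(MemSpace src, MemSpace dst) {
    if (src == MemSpace::Host) {
        return dst == MemSpace::Host ? CopyKind::HostToHost
                                     : CopyKind::HostToDevice;
    }
    return dst == MemSpace::Host ? CopyKind::DeviceToHost
                                 : CopyKind::DeviceToDevice;
}

// Hypothetical self-copy helper: libc memcpy is used only when both buffers
// are in host memory. Every other combination is rejected in this sketch.
void self_copy(void* dst, MemSpace dst_space,
               const void* src, MemSpace src_space, std::size_t nbytes) {
    if (select_copy_kind(src_space, dst_space) != CopyKind::HostToHost) {
        throw std::invalid_argument("device buffer requires a device-aware copy");
    }
    std::memcpy(dst, src, nbytes);
}
```

In a CUDA-enabled build the non-host branches would presumably dispatch to cudaMemcpy() with the corresponding direction rather than throw; the exception here just marks the spot where plain memcpy must not be used.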

A standalone reproducer program is attached. Compile with something like the following (sorry about the C++11; it is not critical to anything in this program, I’m just lazy):

nvcc -ccbin=mpicxx -std=c++11 -g -O0 -DDO_CPU_TO_GPU cuda_mpi_isend_recv_samerank.cpp

Run with something like:

MV2_USE_CUDA=1 ./a.out

Example output with backtrace attached as output.txt.

Other details:

The only configuration flags I’m using are “--enable-cuda --with-cuda=/path/to/cuda-9.1 --enable-g=all --enable-fast=none”. The MVAPICH2 versions tested were 2.3rc2 and trunk. The nodes run standard Linux, something in the RHEL family.

Let me know if I can provide any additional details.

Cheers,
Tom

--

Tom Benson, Ph.D.
Computer Scientist
Lawrence Livermore National Laboratory
benson31 at llnl.gov

-------------- next part --------------
A non-text attachment was scrubbed...
Name: cuda_mpi_isend_recv_samerank.cpp
Type: application/octet-stream
Size: 4559 bytes
Desc: cuda_mpi_isend_recv_samerank.cpp
URL: <http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20180621/04047592/attachment-0001.obj>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: output.txt
URL: <http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20180621/04047592/attachment-0001.txt>

