[mvapich-discuss] [cuda_stage_free] cudaMemcpy failed with 11 at 1564

Wed Nov 7 06:27:50 EST 2012

Hi,

mvapich2-trunk-r5884 solved the issue: there are no more errors when MV2_USE_CUDA is set to 1 for code that calls MPI functions with host buffers only.

Thanks a lot!

Best regards,
Maxim Milakov

From: sreeram.chowdary at gmail.com [mailto:sreeram.chowdary at gmail.com] On Behalf Of sreeram potluri
Sent: Tuesday, November 06, 2012 9:44 PM
To: Maxim Milakov
Cc: mvapich-discuss at cse.ohio-state.edu
Subject: Re: [mvapich-discuss] [cuda_stage_free] cudaMemcpy failed with 11 at 1564

Hi,

This issue has been resolved, and the fix is available in our nightly tarballs for the MVAPICH2 1.9 trunk. Let us know if this helps.

http://mvapich.cse.ohio-state.edu/nightly/mvapich2/trunk/

Best
Sreeram Potluri

On Tue, Nov 6, 2012 at 7:32 AM, Maxim Milakov <mmilakov at nvidia.com> wrote:
Hi all,

I am building and running code with MVAPICH2 1.9a. The code itself is built without CUDA support and does not use CUDA at all.

MVAPICH2 itself, however, is configured with CUDA support:

CPPFLAGS="-D x86_64 -D__align__\(n\)=__attribute__\(\(aligned\(n\)\)\) -D__location__\(a\)=__annotate__\(a\) -DCUDARTAPI=" \
F77=pgf77 FC=pgf90 CC=pgcc CXX=pgcpp \
./configure --prefix=/usr/local/pgi_specific/mvapich2 \
    --enable-f77 --enable-fc --enable-cxx \
    --enable-cuda --with-cuda=$CUDA_INSTALL_PATH \
    --disable-mcast

When I run the code with no specific options, I get no errors: the code runs fine and shows a significant performance improvement over MPICH on configurations with IB.

If I set MV2_USE_CUDA=1 (I tried this because the next step is to run the version of the code ported to GPU), the code aborts with the following error message:

[wm040:mpi_rank_0][cuda_stage_free] cudaMemcpy failed with 11 at 1564
[wm040:mpi_rank_1][cuda_stage_free] cudaMemcpy failed with 11 at 1564
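
When comparing runs across ranks or nodes, it can help to split error lines like the two above into fields. The following is a minimal sketch, assuming the line format is exactly the one shown in this report; the function name parse_mv2_error and the key=value output format are illustrative and not part of MVAPICH2.

```shell
# Sketch: turn MVAPICH2-style error lines into key=value fields.
# parse_mv2_error and its output format are hypothetical helpers,
# written only against the line format quoted in this report.
parse_mv2_error() {
  sed -n 's/^\[\([^:]*\):mpi_rank_\([0-9][0-9]*\)]\[\([^]]*\)] \([A-Za-z_0-9]*\) failed with \([0-9][0-9]*\) at \([0-9][0-9]*\)$/host=\1 rank=\2 reporter=\3 call=\4 code=\5 at=\6/p'
}

# Example: feed it the two lines from the report.
printf '%s\n' \
  '[wm040:mpi_rank_0][cuda_stage_free] cudaMemcpy failed with 11 at 1564' \
  '[wm040:mpi_rank_1][cuda_stage_free] cudaMemcpy failed with 11 at 1564' |
  parse_mv2_error
```

Lines that do not match the pattern are dropped silently. In CUDA runtime headers of this era, code 11 appears to correspond to cudaErrorInvalidValue (invalid argument), which would be consistent with cudaMemcpy receiving a pointer of an unexpected kind; this is worth confirming against driver_types.h in the installed toolkit.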

It seems that MVAPICH2 falsely detects that the pointer passed to mpi_send references GPU memory.

I am using CUDA 5.0, but I see the same error with CUDA 4.2 (with the new driver installed with CUDA 5.0).
The error occurs even if the code runs just a single rank on each node.

I found a similar issue here, with no resolution: http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/2012-August/004014.html

Thank you.

Best regards,
Maxim Milakov


