[mvapich-discuss] InfiniBand-less Single Node with Multiple GPUs

Brody Huval brodyh at stanford.edu
Wed Aug 29 21:54:03 EDT 2012


Hi,

I am trying to set up MVAPICH2 on a single node with 8 GPUs and no InfiniBand. I tried testing it with the OSU micro-benchmarks but am currently getting an error. I configured and ran as follows:


brodyh at watts0:/scr/brodyh/local/libexec/osu-micro-benchmarks$ mpiname -a
MVAPICH2 1.8 Mon Apr 30 14:56:40 EDT 2012 ch3:mrail

Compilation
CC: gcc    -DNDEBUG -DNVALGRIND -O2
CXX: c++   -DNDEBUG -DNVALGRIND -O2
F77: gfortran   -O2 
FC: gfortran   -O2

Configuration
--prefix=/scr/brodyh/local --enable-cuda --with-cuda=/usr/local/cuda



brodyh at watts0:/scr/brodyh/local/libexec/osu-micro-benchmarks$ mpirun_rsh -np 2 watts0 watts0 MV2_USE_CUDA=1 MV2_USE_SHARED_MEM=1 MV2_SMP_SEND_BUF_SIZE=262144 get_local_rank ./osu_bw D D
[watts0.Stanford.EDU:mpi_rank_1][cuda_stage_free] cudaMemcpy failed with 11 at 1261
[watts0.Stanford.EDU:mpi_rank_0][cuda_stage_free] cudaMemcpy failed with 11 at 1261
[watts0.Stanford.EDU:mpispawn_0][readline] Unexpected End-Of-File on file descriptor 5. MPI process died?
[watts0.Stanford.EDU:mpispawn_0][mtpmi_processops] Error while reading PMI socket. MPI process died?
[watts0.Stanford.EDU:mpispawn_0][child_handler] MPI process (rank: 1, pid: 7480) exited with status 255
[watts0.Stanford.EDU:mpispawn_0][child_handler] MPI process (rank: 0, pid: 7479) exited with status 255
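
If I am reading the CUDA 4.x runtime error codes correctly, 11 is cudaErrorInvalidValue, and cuda_stage_free looks like it is part of the device<->host staging path inside MVAPICH2 (though I am only guessing from the name). To check that plain pinned-host <-> device copies work on each GPU outside of MPI, I put together the small standalone test below. This is just a rough sketch of my own, not taken from the benchmarks; the 262144-byte size only mirrors the MV2_SMP_SEND_BUF_SIZE value I passed on the command line.

/* cuda_copy_check.c -- rough standalone sanity check: loop over the
 * GPUs and do a pinned-host <-> device copy on each one. */
#include <cuda_runtime.h>
#include <stdio.h>

int main(void)
{
    int ndev = 0;
    size_t bytes = 262144;   /* same size as MV2_SMP_SEND_BUF_SIZE above */

    cudaGetDeviceCount(&ndev);
    for (int dev = 0; dev < ndev; dev++) {
        char *h_buf, *d_buf;
        cudaError_t err;

        cudaSetDevice(dev);
        cudaMallocHost((void **)&h_buf, bytes);   /* pinned host buffer */
        cudaMalloc((void **)&d_buf, bytes);

        err = cudaMemcpy(d_buf, h_buf, bytes, cudaMemcpyHostToDevice);
        if (err == cudaSuccess)
            err = cudaMemcpy(h_buf, d_buf, bytes, cudaMemcpyDeviceToHost);
        printf("device %d: %s\n", dev, cudaGetErrorString(err));

        cudaFree(d_buf);
        cudaFreeHost(h_buf);
    }
    return 0;
}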




Any idea what could be causing this? Thank you very much for your time.
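
In case it helps narrow things down, this is the stripped-down test I was planning to try next: just an MPI_Send/MPI_Recv of a device buffer between the two local ranks, picking the device from the rank. It is a rough sketch of my own, not the OSU code, and it assumes both ranks run on watts0 as in the command above.

/* min_cuda_mpi.c -- minimal device-buffer send/recv between two
 * local ranks, relying on MV2_USE_CUDA=1 to handle device pointers. */
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>

#define N (1<<20)

int main(int argc, char **argv)
{
    int rank;
    char *d_buf;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* one GPU per rank; fine here since both ranks are on this node */
    cudaSetDevice(rank);
    if (cudaMalloc((void **)&d_buf, N) != cudaSuccess) {
        fprintf(stderr, "rank %d: cudaMalloc failed\n", rank);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    cudaMemset(d_buf, rank, N);

    if (rank == 0)
        MPI_Send(d_buf, N, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
    else if (rank == 1)
        MPI_Recv(d_buf, N, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);

    printf("rank %d done\n", rank);
    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}

I would build it with something like
mpicc min_cuda_mpi.c -I/usr/local/cuda/include -L/usr/local/cuda/lib64 -lcudart
and run it with the same mpirun_rsh / MV2_USE_CUDA=1 options as above.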


Best,
Brody Huval