[mvapich-discuss] Fw: run mvapich for GPU communication
li.luo at siat.ac.cn
Mon Aug 12 22:43:58 EDT 2013
To give more details:
I am using CUDA 5.0 and the latest MVAPICH2 1.9,
configured with: ./configure --prefix=/opt/mvapich2-1.9-gnu --enable-shared --enable-cuda --with-cuda=/home/liluo/lib/cuda_5.0 --disable-mcast
osu_alltoallv runs fine:
[liluo at gpu2 osu_benchmarks]$ mpirun_rsh -np 2 gpu1-ib gpu2-ib MV2_USE_CUDA=1 get_local_rank ./osu_alltoallv D D
# OSU MPI All-to-Allv Personalized Exchange Latency Test
# Size Avg Latency(us)
1 4.56
2 4.63
4 4.59
8 4.58
16 4.58
32 4.66
64 6.36
128 6.66
256 7.24
512 7.98
1024 9.53
2048 12.53
4096 17.48
8192 26.35
16384 43.49
32768 85.12
65536 140.24
131072 250.05
262144 483.49
524288 932.46
1048576 1866.31
A related issue can be found at http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/2012-November/004119.html
It seems my cudaMemcpy failure is related to buffer detection.
My original problem is described in my previous email, quoted below.
Thanks.
-----Original Messages-----
From: li.luo at siat.ac.cn
Sent Time: Tuesday, August 13, 2013
To: mvapich at cse.ohio-state.edu
Cc:
Subject: run mvapich for GPU communication
Hi,
I want to use MPI_Alltoallv to communicate between 2 GPU cards, running the program with:
mpirun_rsh -np 2 -hostfile hosts MV2_USE_CUDA=1 ./ex1 ...
Strangely, when I compile with nvcc using debug options such as -g -G, the program runs correctly.
But if I compile with nvcc using -O, it fails with:
[gpu2:mpi_rank_1][cuda_stage_alloc_v] cudaMemcpy failed with 4 at 2020
[gpu1:mpi_rank_0][cuda_stage_alloc_v] cudaMemcpy failed with 4 at 2020
[gpu2:mpispawn_1][readline] Unexpected End-Of-File on file descriptor 5. MPI process died?
[gpu2:mpispawn_1][mtpmi_processops] Error while reading PMI socket. MPI process died?
[gpu2:mpispawn_1][child_handler] MPI process (rank: 1, pid: 16996) exited with status 1
[gpu1:mpispawn_0][readline] Unexpected End-Of-File on file descriptor 5. MPI process died?
[gpu1:mpispawn_0][mtpmi_processops] Error while reading PMI socket. MPI process died?
make: *** [runex1_mvapich] Error 1
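One host-side check that may help narrow this down, independent of the optimization level, is whether the counts and displacements passed to MPI_Alltoallv stay inside the allocated send and receive buffers. The following is only a minimal sketch of such a check, not code taken from ex1: the function names build_displs and blocks_fit are hypothetical, counts are assumed to be in elements rather than bytes, and integer overflow of the running total is not handled.

```c
/* Fill displs[] with the exclusive prefix sum of counts[].
 * Returns the total element count, or -1 if any count is negative. */
int build_displs(const int *counts, int *displs, int nprocs)
{
    int total = 0;
    for (int i = 0; i < nprocs; ++i) {
        if (counts[i] < 0)
            return -1;
        displs[i] = total;      /* block i starts where block i-1 ended */
        total += counts[i];
    }
    return total;
}

/* Returns 1 if every block [displs[i], displs[i] + counts[i]) lies inside
 * a buffer of buf_len elements, 0 otherwise. */
int blocks_fit(const int *counts, const int *displs, int nprocs, int buf_len)
{
    if (buf_len < 0)
        return 0;
    for (int i = 0; i < nprocs; ++i) {
        if (counts[i] < 0 || displs[i] < 0)
            return 0;
        /* Written as a subtraction so the comparison cannot overflow. */
        if (displs[i] > buf_len - counts[i])
            return 0;
    }
    return 1;
}
```

Calling blocks_fit on both the send and the receive arrays just before MPI_Alltoallv, with buf_len set to the number of elements actually allocated on the device, would show whether the failure comes from the arguments or from something earlier in the program.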
--
Li Luo
Shenzhen Institutes of Advanced Technology
Address: 1068 Xueyuan Avenue, Shenzhen University Town, Shenzhen, P.R.China
Tel: +86-755-86392312, +86-15899753087
Email: li.luo at siat.ac.cn