[mvapich-discuss] CUDA running issue in MVAPICH2

randonlang at gmail.com
Thu Apr 9 12:11:59 EDT 2015


Thanks Jonathan, it works! And thanks Khaled too.
Sorry to bother you again :p
But I got some odd results: D to D is far slower than H to H for small messages, and even slower than the H to D case.

Here are the benchmark results:

# OSU MPI-CUDA Latency Test
# Send Buffer on DEVICE (D) and Receive Buffer on DEVICE (D)
# Size            Latency (us)
1                        63.42
2                        63.02
4                        61.95
8                        61.96
16                       61.87
32                       61.95
64                       61.92
128                      61.94
256                      61.97
512                      61.98
1024                     62.06
2048                     62.05
4096                     62.12
8192                     62.15
16384                    74.19
32768                    74.25
65536                    75.24
131072                   82.66
262144                   81.32
524288                   85.70
1048576                 121.99
2097152                 272.36
4194304                 585.34

# OSU MPI-CUDA Latency Test
# Send Buffer on HOST (H) and Receive Buffer on HOST (H)
# Size            Latency (us)
1                         0.92
2                         0.91
4                         0.91
8                         0.92
16                        0.91
32                        0.93
64                        0.99
128                       0.96
256                       1.03
512                       1.11
1024                      1.20
2048                      1.39
4096                      1.78
8192                      2.74
16384                     5.31
32768                     7.32
65536                     8.00
131072                   13.95
262144                   29.38
524288                   57.95
1048576                 115.65
2097152                 226.63
4194304                 571.31
 
# OSU MPI-CUDA Latency Test
# Send Buffer on HOST (H) and Receive Buffer on DEVICE (D)
# Size            Latency (us)
1                         9.59
2                         9.73
4                         9.56
8                         9.66
16                        9.83
32                        9.63
64                        9.75
128                       8.57
256                       8.42
512                       8.87
1024                      8.62
2048                      8.79
4096                      9.34
8192                     10.37
16384                    12.40
32768                    19.03
65536                    21.84
131072                   35.24
262144                   66.08
524288                  110.40
1048576                 207.23
2097152                 354.09
4194304                 669.29
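
For my own understanding: I guess part of the ~60 us floor on small D-to-D messages is the cost of moving the data through host memory when GPUDirect is not available, i.e. an extra cudaMemcpy on each side of the transfer. Below is a minimal sketch of what I imagine the data path looks like; the helper names and the staging scheme are only my illustration, not MVAPICH2's actual internals:
```
/* Minimal sketch (my guess, not MVAPICH2 internals): without GPUDirect, a
 * device buffer has to be staged through host memory, so every message pays
 * for a synchronous cudaMemcpy on the sender and another on the receiver. */
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdlib.h>

/* Hypothetical helper: send 'count' bytes that live in GPU memory. */
static void staged_send(const void *d_buf, int count, int dest, MPI_Comm comm)
{
    void *h_staging = malloc(count);
    /* D2H copy: several microseconds of fixed overhead even for tiny sizes */
    cudaMemcpy(h_staging, d_buf, count, cudaMemcpyDeviceToHost);
    MPI_Send(h_staging, count, MPI_BYTE, dest, 0, comm);
    free(h_staging);
}

/* Hypothetical helper: receive 'count' bytes into GPU memory. */
static void staged_recv(void *d_buf, int count, int src, MPI_Comm comm)
{
    void *h_staging = malloc(count);
    MPI_Recv(h_staging, count, MPI_BYTE, src, 0, comm, MPI_STATUS_IGNORE);
    /* H2D copy: the same fixed overhead again on the receiving side */
    cudaMemcpy(d_buf, h_staging, count, cudaMemcpyHostToDevice);
    free(h_staging);
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int n = 8;                /* a small message, like the 8-byte row above */
    void *d_buf;
    cudaMalloc(&d_buf, n);

    if (rank == 0)
        staged_send(d_buf, n, 1, MPI_COMM_WORLD);
    else if (rank == 1)
        staged_recv(d_buf, n, 0, MPI_COMM_WORLD);

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}
```
If that is roughly what happens, it would explain why H to D sits in between H to H and D to D, although it does not account for the whole gap, so I may be missing something (extra synchronization, perhaps?).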


From: Jonathan Perkins
Date: 2015-04-09 21:40
To: Dun Liang; mvapich-discuss
Subject: Re: [mvapich-discuss] CUDA running issue in MVAPICH2
Hi Dun, can you try setting MV2_USE_CUDA=1 when you run the benchmarks with the device buffers?

Example:
mpirun_rsh -np 2 debian81 debian81 MV2_USE_CUDA=1 ./osu_latency D D
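
If you launch with mpiexec (Hydra) instead of mpirun_rsh, I believe passing the variable with -env should also work, for example:
mpiexec -n 2 -env MV2_USE_CUDA 1 ./osu_latency D D
(That form is only a sketch on my part; the mpirun_rsh command above is the primary suggestion.)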

On Thu, Apr 9, 2015 at 8:54 AM Dun Liang <randonlang at gmail.com> wrote:
Dear developers:

I am currently having some problems running MVAPICH2 with CUDA; the program is osu_latency.
Here is the error message:
```
┌─[liangdun at debian81] - [~/mvapich/mvapich2-2.1rc2_ib/mvapich2-2.1rc2/osu_benchmarks/.libs] - [2015-04-09 06:17:20]
└─[1] <> mpirun_rsh -np 2 debian81 debian81 ./osu_latency D D
# OSU MPI-CUDA Latency Test
# Send Buffer on DEVICE (D) and Receive Buffer on DEVICE (D)
# Size            Latency (us)
[debian81:mpi_rank_0][error_sighandler] Caught error: Segmentation fault (signal 11)
[debian81:mpispawn_0][readline] Unexpected End-Of-File on file descriptor 6. MPI process died?
[debian81:mpispawn_0][mtpmi_processops] Error while reading PMI socket. MPI process died?
[debian81:mpispawn_0][child_handler] MPI process (rank: 0, pid: 1376) terminated with signal 11 -> abort job
[debian81:mpirun_rsh][process_mpispawn_connection] mpispawn_0 from node debian81 aborted: Error while reading a PMI socket (4)

```
It works fine when I run `./osu_latency H H`:
```
┌─[liangdun at debian81] - [~/mvapich/mvapich2-2.1rc2_ib/mvapich2-2.1rc2/osu_benchmarks/.libs] - [2015-04-09 06:17:41]
└─[1] <> mpirun_rsh -np 2 debian81 debian81 ./osu_latency H H
# OSU MPI-CUDA Latency Test
# Send Buffer on HOST (H) and Receive Buffer on HOST (H)
# Size            Latency (us)
1                         0.28
2                         0.27
4                         0.27
8                         0.29
16                        0.27
32                        0.28
64                        0.31
128                       0.33
256                       0.39
512                       0.46
1024                      0.56
2048                      0.75
4096                      1.24
8192                      1.99
16384                     3.71
32768                     6.49
65536                     6.96
131072                   12.95
262144                   27.73
524288                   56.53
1048576                 113.61
2097152                 226.53
4194304                 628.29

```

Here is my MPI version info:
```
MVAPICH2 Version:       2.1rc2
MVAPICH2 Release date:  Thu Mar 12 20:00:00 EDT 2014
MVAPICH2 Device:        ch3:mrail
MVAPICH2 configure:     --prefix=/home/liangdun/mvapich/build --enable-cuda --disable-mcast --with-cuda=/usr/local/cuda --with-device=ch3:mrail
MVAPICH2 CC:    gcc    -DNDEBUG -DNVALGRIND -O2
MVAPICH2 CXX:   g++   -DNDEBUG -DNVALGRIND -O2
MVAPICH2 F77:   gfortran -L/lib -L/lib   -O2
MVAPICH2 FC:    gfortran   -O2
```
The special circumstance is that there is no InfiniBand installed in my computer, but I still have to test CUDA. I found that the --enable-cuda configure option does not work when I use --with-device=ch3:sock.

Here are my questions:
* Is this CUDA error caused by the lack of an InfiniBand installation?
* Is there any way to test CUDA with a TCP/IP setup?

Sorry for my poor English. I appreciate the MVAPICH team's work!

Best regards,

Dun