[mvapich-discuss] mvapich2-1.8ap cuda osu bandwidth test

Yaoming Mu yaomingmu at gmail.com
Tue Jan 24 10:21:11 EST 2012


I installed mvapich2-1.8ap with CUDA support, but my bandwidth test
results are very poor compared to the results published by the MVAPICH2 team:

# OSU MPI-CUDA Bandwidth Test
# Send Buffer on HOST (H) and Receive Buffer on DEVICE (D)
# Size        Bandwidth (MB/s)
1                         0.12
2                         0.25
4                         0.57
8                         1.09
16                        2.19
32                        4.42
64                        8.96
128                      17.78
256                      35.42
512                      69.32
1024                    132.06
2048                    243.37
4096                    408.74
8192                    557.47
16384                   479.83
32768                   542.55
65536                   576.15
131072                  598.19
262144                  606.11
524288                  611.05
1048576                 611.86
2097152                 610.60
4194304                 607.65


Even host-to-host bandwidth is this poor if I export MV2_USE_CUDA=1:

# OSU MPI-CUDA Bandwidth Test
# Send Buffer on HOST (H) and Receive Buffer on HOST (H)
# Size        Bandwidth (MB/s)
1                         1.04
2                         2.67
4                         5.34
8                        10.52
16                       21.01
32                       41.53
64                       81.90
128                     140.18
256                     159.54
512                     271.31
1024                    389.32
2048                    485.95
4096                    557.83
8192                    585.33
16384                   493.49
32768                   553.46
65536                   589.91
131072                  608.17
262144                  619.25
524288                  622.58
1048576                 624.03
2097152                 624.59
4194304                 623.50

But if I disable CUDA with export MV2_USE_CUDA=0, I get normal results:
# OSU MPI-CUDA Bandwidth Test
# Send Buffer on HOST (H) and Receive Buffer on HOST (H)
# Size        Bandwidth (MB/s)
1                         4.12
2                         8.42
4                        16.22
8                        33.31
16                       70.34
32                      122.98
64                      262.41
128                     503.14
256                     957.10
512                    1657.49
1024                   2806.88
2048                   5126.41
4096                   7364.44
8192                  10301.80
16384                 10391.13
32768                  9811.05
65536                 11492.53
131072                11793.38
262144                10714.53
524288                10282.15
1048576               10413.02
2097152               10124.43
4194304                5963.46
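
To put a number on the gap between the two host-to-host runs, here is a minimal sketch that divides the MV2_USE_CUDA=0 bandwidth by the MV2_USE_CUDA=1 bandwidth at a few message sizes. The values are copied from the two host-to-host tables above; the dictionary names and the slowdown() helper are only illustrative and are not part of MVAPICH2 or the OSU benchmarks.

```python
# Bandwidth in MB/s at a few message sizes (bytes), copied from the two
# host-to-host tables above. Only a sample of rows is used here.
cuda_on = {1: 1.04, 1024: 389.32, 65536: 589.91,
           1048576: 624.03, 4194304: 623.50}       # MV2_USE_CUDA=1
cuda_off = {1: 4.12, 1024: 2806.88, 65536: 11492.53,
            1048576: 10413.02, 4194304: 5963.46}   # MV2_USE_CUDA=0


def slowdown(on, off):
    """Return {size: off/on} for every message size present in both tables.

    Hypothetical helper for this comparison only.
    """
    return {size: off[size] / on[size] for size in sorted(on) if size in off}


ratios = slowdown(cuda_on, cuda_off)
for size, ratio in ratios.items():
    print(f"{size:>8} bytes: {ratio:5.1f}x slower with MV2_USE_CUDA=1")
```

For these sampled rows the ratio runs from roughly 4x at 1 byte to roughly 19x at 64 KB, so the slowdown is not limited to small messages.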

My CUDA version is 4.1rc2.

