[mvapich-discuss] intra node communication between GPUs
Ye Wang
wang1351 at purdue.edu
Mon Jun 10 10:30:52 EDT 2013
Hi,
I am using mvapich2-1.9 for communication between GPUs. When I tested bandwidth with osu_bw from the osu_benchmarks package on a GPU cluster, I found that the bandwidth between two GPUs on the same PCIe bus is lower than the bandwidth between two GPUs on different nodes. I cannot figure out why. With GPUDirect v2 support, communication between two GPUs on the same PCIe bus should go directly over the PCIe bus, so why is it slower than communication between GPUs on separate nodes?
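In case it helps, I launch the two cases roughly along these lines (the hostnames and the path to osu_bw are placeholders, not my exact setup; MV2_USE_CUDA=1 is MVAPICH2's switch for CUDA device-buffer support):

```shell
# Intra-node: both ranks on the same host, each bound to a GPU.
# The "D D" arguments place both send and receive buffers on the device.
mpirun_rsh -np 2 node01 node01 MV2_USE_CUDA=1 ./osu_bw D D

# Inter-node: one rank per host.
mpirun_rsh -np 2 node01 node02 MV2_USE_CUDA=1 ./osu_bw D D
```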
The following is the bandwidth between two GPUs on same PCIe bus:
# Send Buffer on DEVICE (D) and Receive Buffer on DEVICE (D)
# Size Bandwidth (MB/s)
1 0.13
2 0.25
4 0.51
8 1.01
16 2.03
32 4.04
64 8.08
128 16.25
256 34.22
512 67.96
1024 137.11
2048 272.85
4096 540.70
8192 1092.38
16384 2125.43
32768 3134.74
65536 4022.73
131072 4836.31
262144 4944.00
524288 5009.88
1048576 5019.20
2097152 5052.10
4194304 5067.23
And this is the bandwidth between GPUs on two nodes:
# Send Buffer on DEVICE (D) and Receive Buffer on DEVICE (D)
# Size Bandwidth (MB/s)
1 0.09
2 0.17
4 0.34
8 0.69
16 1.35
32 2.68
64 5.46
128 10.88
256 21.89
512 43.00
1024 83.51
2048 160.64
4096 297.13
8192 524.53
16384 1547.57
32768 2525.94
65536 3607.36
131072 4556.56
262144 5174.55
524288 5318.21
1048576 5346.60
2097152 5373.96
4194304 5405.89
Thanks,
Ye