[mvapich-discuss] Difference between MVAPICH2 and MVAPICH2-GDR

makai makailove123 at 163.com
Thu Jul 9 10:55:23 EDT 2015


I have installed MVAPICH2-GDR and gdrcopy, but when I run osu_latency the results look strange.

makai at gpu-cluster-3:~$ $MV2_PATH/bin/mpiexec -hosts 192.168.2.3,192.168.2.4 -n 2 -env MV2_USE_CUDA 1 /opt/mvapich2/gdr/2.1/cuda6.5/gnu/libexec/mvapich2/osu_latency D D
# OSU MPI-CUDA Latency Test
# Send Buffer on DEVICE (D) and Receive Buffer on DEVICE (D)
# Size            Latency (us)
Warning *** The GPU and IB selected are not on the same socket. Do not delever the best performance 
Warning *** The GPU and IB selected are not on the same socket. Do not delever the best performance 
0                         1.67
1                         2.91
2                         3.92
4                         3.99
8                         3.92
16                        3.97
32                      160.67
64                      161.51
128                     162.05
256                     165.20
512                     165.88
1024                    168.92
2048                    176.08
4096                    185.95
8192                     72.63
16384                   261.26
32768                   148.08
65536                   518.37
131072                  143.93
262144                  260.03
524288                  254.41
1048576                 393.54
2097152                 672.47
4194304                1244.69
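
For reference, as I understand it, what osu_latency D D exercises is just MPI_Send/MPI_Recv on cudaMalloc'ed buffers, roughly like the minimal sketch below (buffer size and compile line are only illustrative):

    /* Minimal sketch of the CUDA-aware MPI pattern: device pointers are
     * passed straight to MPI_Send/MPI_Recv, with no explicit host staging.
     * Illustrative only; run with two ranks, e.g. via mpiexec as above.
     * Compile with the MVAPICH2-GDR mpicc, e.g.
     *   $MV2_PATH/bin/mpicc pingpong.c -o pingpong -lcudart
     */
    #include <mpi.h>
    #include <cuda_runtime.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int nbytes = 4 * 1024 * 1024;   /* 4 MB, like the largest osu_latency size */
        char *d_buf = NULL;
        cudaMalloc((void **)&d_buf, nbytes);  /* buffer lives in GPU device memory */

        if (rank == 0) {
            /* With MV2_USE_CUDA=1 the library detects that d_buf is a
             * device pointer and moves the data itself. */
            MPI_Send(d_buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(d_buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received %d bytes into device memory\n", nbytes);
        }

        cudaFree(d_buf);
        MPI_Finalize();
        return 0;
    }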

Why does the latency become so much worse once the message size goes above 32 bytes?
Also, I find there is only one CA on my node, so why does it tell me "The GPU and IB selected are not on the same socket"?

Could you give me some help with this?
Thanks!

> On Jun 25, 2015, at 8:56 PM, Panda, Dhabaleswar <panda at cse.ohio-state.edu> wrote:
> 
> Thanks for your note. You are mixing up two concepts: 1) CUDA-aware MPI and 2) GPUDirect RDMA. The 
> CUDA-aware MPI concept allows MPI_Send and MPI_Recv to use data residing in GPU device memory directly. 
> GPUDirect RDMA (GDR) allows data to be moved from one GPU to another GPU across the PCIe 
> interface using RDMA (say over InfiniBand) without going through host memory. 
> 
> MVAPICH2 supports only CUDA-aware MPI.
> 
> MVAPICH2-GDR supports CUDA-aware MPI, GPUDirect RDMA (GDR), and many other advanced designs
> that exploit the performance and scalability of GPU clusters. For example, you can get very low 
> D-D latency (close to 2 microseconds) with MVAPICH2-GDR. Thus, for GPU clusters with InfiniBand,  
> we strongly recommend that users use MVAPICH2-GDR. Please take a look at the MVAPICH2-GDR user 
> guide at the following URL for all features and usage guidelines: 
> 
> http://mvapich.cse.ohio-state.edu/userguide/gdr/
> 
> Hope this helps. 
> 
> DK
> 
> 
> 
> ________________________________________
> From: mvapich-discuss-bounces at cse.ohio-state.edu on behalf of makai [makailove123 at 163.com]
> Sent: Thursday, June 25, 2015 1:02 AM
> To: mvapich-discuss at cse.ohio-state.edu
> Subject: [mvapich-discuss] Difference between MVAPICH2 and MVAPICH2-GDR
> 
> I have installed MVAPICH2, and it says that it supports GPUDirect RDMA. MPI_Send and MPI_Recv can use device addresses directly for data transmission.
> So, what’s the difference between MVAPICH2 and MVAPICH2-GDR?
> 
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
