[mvapich-discuss] OSU Benchmark pt2pt bandwidth

Panda, Dhabaleswar panda at cse.ohio-state.edu
Thu May 15 10:25:23 EDT 2014


Dear Prof. Tan, 

Thanks for your note.

- There is nothing wrong with the benchmark. This is a bandwidth benchmark which
measures the maximum bandwidth that can be achieved by sending many back-to-back messages.
This is consistent with other benchmarks which measure bandwidth.
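
As a rough illustration, the core of such a bandwidth test is a window of non-blocking
sends completed together, followed by a single short acknowledgement from the receiver.
A simplified sketch of that sender-side pattern (the variable names, tags and window size
below are illustrative, assuming the peer is rank 1 posting matching receives; they are
not copied from osu_bw.c):

    /* Sender side of a back-to-back bandwidth loop (illustrative sketch). */
    #include <mpi.h>

    #define WINDOW_SIZE 64

    double measure_bw(char *s_buf, int size, int iterations)
    {
        MPI_Request reqs[WINDOW_SIZE];
        double t_start = MPI_Wtime();

        for (int i = 0; i < iterations; i++) {
            /* Post a whole window of sends back-to-back ... */
            for (int w = 0; w < WINDOW_SIZE; w++)
                MPI_Isend(s_buf, size, MPI_CHAR, 1, 100,
                          MPI_COMM_WORLD, &reqs[w]);
            MPI_Waitall(WINDOW_SIZE, reqs, MPI_STATUSES_IGNORE);

            /* ... then wait for one short ack from the receiver. */
            MPI_Recv(NULL, 0, MPI_CHAR, 1, 101,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        }

        /* Bandwidth in bytes/second over all messages in all windows. */
        return (double)size * WINDOW_SIZE * iterations
               / (MPI_Wtime() - t_start);
    }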

- For InfiniBand, you need to register the buffers before doing the communication. Thus, the 
first time you use any buffer, you will see overhead. However, the MVAPICH2 stack has mechanisms
like a `registration cache' which help avoid this overhead on successive accesses to the same 
buffer. That is the behavior you are seeing in your runs.
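
This is also why the benchmark skips the first few iterations: the warm-up sends absorb the
one-time registration cost, and the timed iterations then hit the registration cache. A minimal
sketch of that skip logic (SKIP and the loop structure are illustrative, not the exact osu_bw.c
code):

    #define SKIP 10   /* illustrative number of warm-up iterations */

    double t_start = 0.0;
    for (int i = 0; i < SKIP + iterations; i++) {
        if (i == SKIP)              /* start the clock only after warm-up */
            t_start = MPI_Wtime();
        /* ... post one window of MPI_Isend + MPI_Waitall + ack recv,
         *     as in the sketch above ... */
    }
    double elapsed = MPI_Wtime() - t_start;   /* excludes registration cost */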

- Application communication characteristics are diverse, with very different requirements,
and application writers use many different communication patterns. These cannot all be
reflected at the micro-benchmark level. However, the micro-benchmarks help application
writers understand the impact of their desired (or alternative) communication patterns. 

- The micro-benchmarks can also be modified suitably to reflect the communication 
characteristics of the application. For example, if the application performs only a few
communication operations, you can reduce the number of iterations to mimic it. Similarly, if you
do not have back-to-back messages and only have a single communication operation whose 
impact on latency you want to see, you can use the latency benchmark. If the application sends data 
from different buffers, you can also modify the benchmarks to do this. 
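
For instance, here is a sketch of that last modification: cycling through a pool of send buffers
so that every iteration touches a different one (the pool size and names are illustrative):

    #include <stdlib.h>

    #define NUM_BUFS 8   /* illustrative pool size */

    /* Allocate a pool of send buffers so every iteration uses a different
     * one; the first pass over the pool pays the registration cost, and
     * later passes hit the registration cache. */
    char *bufs[NUM_BUFS];
    for (int b = 0; b < NUM_BUFS; b++)
        bufs[b] = malloc(size);

    for (int i = 0; i < iterations; i++) {
        char *s_buf = bufs[i % NUM_BUFS];   /* different buffer each iteration */
        /* ... post the window of MPI_Isend from s_buf, as before ... */
    }

    for (int b = 0; b < NUM_BUFS; b++)
        free(bufs[b]);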

Hope this answers your questions. 

Thanks, 

DK

________________________________________
From: mvapich-discuss-bounces at cse.ohio-state.edu on behalf of Tan Guangming [guangming.tan at gmail.com]
Sent: Wednesday, May 14, 2014 10:07 AM
To: mvapich-discuss at cse.ohio-state.edu
Subject: [mvapich-discuss] OSU Benchmark pt2pt bandwidth

The benchmark code in the file "osu_bw.c" measures bandwidth by skipping
the first one or two communication operations, and all communications operate
on the same data and buffers. In fact, the first one achieves much
lower bandwidth. For example, when we transfer 77 MB messages, the first one
reaches only 1.1 GB/s.
# OSU MPI Bandwidth Test v4.3
# Size      Bandwidth (MB/s)
loop 0: t = 0.067385 bw = 1151512533.291215
loop 1: t = 0.024156 bw = 3212217392.977516
loop 2: t = 0.024135 bw = 3215041557.475585
loop 3: t = 0.024151 bw = 3212915039.307534
loop 4: t = 0.024132 bw = 3215422723.669898
loop 5: t = 0.024151 bw = 3212883321.536630
loop 6: t = 0.024128 bw = 3215931085.875594
loop 7: t = 0.024154 bw = 3212502757.126178
loop 8: t = 0.024129 bw = 3215835755.718114
loop 9: t = 0.024153 bw = 3212629601.912008
loop 10: t = 0.024132 bw = 3215422723.669898
loop 11: t = 0.024154 bw = 3212502757.126178
loop 12: t = 0.024138 bw = 3214596977.783116
loop 13: t = 0.024156 bw = 3212249097.601548
loop 14: t = 0.024135 bw = 3215041557.475585
loop 15: t = 0.024125 bw = 3216344248.544254
loop 16: t = 0.024129 bw = 3215835755.718114
loop 17: t = 0.024154 bw = 3212471047.494779
loop 18: t = 0.024129 bw = 3215803980.254889
loop 19: t = 0.024152 bw = 3212788171.981205
loop 20: t = 0.024132 bw = 3215422723.669898
loop 21: t = 0.024151 bw = 3212915039.307534
77594624             3213.57

However, a real application does not match this communication mode. In
general, every communication operates on different data (often even a different
buffer). Thus, is there any way to improve this?


_______________________________________________
mvapich-discuss mailing list
mvapich-discuss at cse.ohio-state.edu
http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss