[mvapich-discuss] Announcing the release of MVAPICH2 0.9.5 with
SRQ, integrated multi-rail and TotalView support
Sayantan Sur
surs at cse.ohio-state.edu
Wed Sep 6 09:44:11 EDT 2006
Hello Pasha,
Pavel Shamis (Pasha) wrote:
> Christian Guggenberger wrote:
>> without getting performance penalties as Sayantan described ? (all
>> MT23108 here are already at least at FW-3.4.0 here - my OFED
>> test-environment even at
>> 3.5.0)
>>
> You will have some performance penalty, but it should not be very
> high. On my machine I see ~ 2-3% difference with osu benchmarks.
I ran some Gen2-level experiments (before going higher up to the MPI
layer) to see whether the latest firmware (3.5.0) makes any difference to
SRQ performance on PCI-X HCAs. Our experimental machines are dual
Intel Xeon 3.0 GHz systems with 2 GB of main memory. They have MT23108 HCAs
with firmware 3.5.0
(http://www.mellanox.com/downloads/firmware/fw-23108-3_5_000-MHX-CE128-T.bin.zip).
The machines run OpenIB Gen2 on Linux kernel version 2.6.16.
I used the standard `perftest' distributed by OpenIB. In particular, the
tests send_lat.c and send_bw.c seem to measure the latency and bandwidth
of send/recv operations. I trivially modified them to post receives to
an SRQ instead of to the queue pair's receive queue. The performance
numbers are noted below. Based on these numbers, the SRQ penalty appears
to be well above the 2-3% you observed. Could you post your Gen2-level
comparison numbers? If there is something simple I'm missing, I'd like to
correct it.
Thanks,
Sayantan.
===========
Gen2 Latency (us)
#Size(bytes)  SRQ     Send/Recv
2             10.66   6.06
4             10.65   6.07
8             10.65   6.12
16            10.71   6.12
32            10.76   6.31
64            10.89   6.51
128           11.04   6.51
256           11.58   7.00
512           12.56   7.99
1024          13.87   9.40

Gen2 Bandwidth (MB/s)
#Size(bytes)  SRQ     Send/Recv
2             0.14    0.81
4             0.29    1.68
8             0.57    3.26
16            1.14    6.68
32            2.34    13.46
64            4.55    26.08
128           9.06    51.99
256           18.22   105.34
512           36.44   214.36
1024          75.05   423.72
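
For reference, the relative SRQ penalty implied by the tables above can be
worked out directly. The following is a minimal Python sketch of that
arithmetic. The function and variable names are illustrative only and are
not part of perftest; the data are copied from the tables above.

```python
# Values copied from the Gen2 tables above: (size in bytes, SRQ, send/recv).
LATENCY_US = [
    (2, 10.66, 6.06), (4, 10.65, 6.07), (8, 10.65, 6.12),
    (16, 10.71, 6.12), (32, 10.76, 6.31), (64, 10.89, 6.51),
    (128, 11.04, 6.51), (256, 11.58, 7.00), (512, 12.56, 7.99),
    (1024, 13.87, 9.40),
]
BANDWIDTH_MBS = [
    (2, 0.14, 0.81), (4, 0.29, 1.68), (8, 0.57, 3.26),
    (16, 1.14, 6.68), (32, 2.34, 13.46), (64, 4.55, 26.08),
    (128, 9.06, 51.99), (256, 18.22, 105.34), (512, 36.44, 214.36),
    (1024, 75.05, 423.72),
]


def latency_penalty_pct(srq_us, sendrecv_us):
    """Extra latency of SRQ relative to plain send/recv, in percent."""
    return (srq_us - sendrecv_us) / sendrecv_us * 100.0


def bandwidth_loss_pct(srq_mbs, sendrecv_mbs):
    """Bandwidth lost with SRQ relative to plain send/recv, in percent."""
    return (sendrecv_mbs - srq_mbs) / sendrecv_mbs * 100.0


def summarize():
    """Return (size, latency penalty %, bandwidth loss %) for each size."""
    rows = []
    for (size, lat_srq, lat_sr), (_, bw_srq, bw_sr) in zip(LATENCY_US, BANDWIDTH_MBS):
        rows.append((size,
                     latency_penalty_pct(lat_srq, lat_sr),
                     bandwidth_loss_pct(bw_srq, bw_sr)))
    return rows


if __name__ == "__main__":
    print("#Size(bytes)  latency penalty (%)  bandwidth loss (%)")
    for size, lat_pct, bw_pct in summarize():
        print(f"{size:<13} {lat_pct:>18.1f}  {bw_pct:>18.1f}")
```

On these numbers the latency penalty is roughly 76% at 2 bytes and about
48% at 1024 bytes, which is what the comparison against the 2-3% figure
rests on.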
--
http://www.cse.ohio-state.edu/~surs