[mvapich-discuss] OSU benchmarks interpretation
Nikita Andreev
lestat at kemsu.ru
Wed Mar 2 07:26:47 EST 2011
Peter,
You were right that I was actually using shared memory. It turned out that
with a nodes=2:ppn=1 job specification, MAUI places both processes on the same node.
JOBNODEMATCHPOLICY EXACTNODE changes this policy.
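For anyone hitting the same thing, the setting is a one-liner in the MAUI
scheduler configuration (the maui.cfg path is an assumption for a typical
install; restart the scheduler after editing):

```
# /usr/local/maui/maui.cfg (path may differ per site)
# Force MAUI to honor the requested node count exactly,
# instead of packing tasks onto fewer nodes.
JOBNODEMATCHPOLICY EXACTNODE
```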
Now I have for mrail:

osu_bw        1770.10 MB/s
osu_bibw      3500.82 MB/s
osu_put_bw    1768.58 MB/s
osu_put_bibw  3405.17 MB/s
osu_get_bw    1769.05 MB/s
Thanks for the help.
Regards,
Nikita
-----Original Message-----
From: mvapich-discuss-bounces at cse.ohio-state.edu
[mailto:mvapich-discuss-bounces at cse.ohio-state.edu] On Behalf Of Peter
Kjellstrom
Sent: Wednesday, March 02, 2011 5:00 PM
To: mvapich-discuss at cse.ohio-state.edu
Subject: Re: [mvapich-discuss] OSU benchmarks interpretation
On Wednesday, March 02, 2011 08:03:42 am Nikita Andreev wrote:
> I'm benchmarking bandwidth between two compute nodes equipped with
> Mellanox ConnectX DDR InfiniBand two-port HCAs. I run benchmarks under
> OpenMPI which supports dual-rail configurations.
>
> Results for message size 4194304:
>
> osu_bw 4917.75 MB/s
> osu_bibw 5007.49 MB/s
> osu_put_bw 3489.35 MB/s
> osu_put_bibw 3876.96 MB/s
> osu_get_bw 3482.18 MB/s
This is way too fast for a single DDR ConnectX; you're probably running the
test over shared memory on one node.
Expected DDR performance (one port) is roughly:
unidir PCIe 2.5GT: 1400 MB/s
unidir PCIe 5.0GT: 1950 MB/s
bidir: ~2x unidir
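As a sanity check on those numbers, here is the raw-rate arithmetic as I
understand it (my assumptions: 4x DDR IB runs 5 GT/s per lane over 4 lanes
with 8b/10b encoding, and the HCA sits in a PCIe x8 slot, which uses the
same encoding at 2.5 or 5.0 GT/s per lane):

```python
# Back-of-the-envelope payload rates for 4x DDR InfiniBand and the
# PCIe x8 slots it plugs into. These are theoretical upper bounds;
# achieved bandwidth is lower due to protocol overhead.

def data_rate_gbit(signal_gt_per_s: float, lanes: int) -> float:
    """Payload rate in Gbit/s for an 8b/10b-encoded serial link."""
    return signal_gt_per_s * lanes * 8 / 10

ib_ddr    = data_rate_gbit(5.0, 4)  # 4x DDR IB: 20 GT/s signal -> 16 Gbit/s data
pcie_gen1 = data_rate_gbit(2.5, 8)  # PCIe x8 @ 2.5 GT/s        -> 16 Gbit/s data
pcie_gen2 = data_rate_gbit(5.0, 8)  # PCIe x8 @ 5.0 GT/s        -> 32 Gbit/s data

print(f"IB 4x DDR:  {ib_ddr:.0f} Gbit/s = {ib_ddr / 8:.1f} GB/s")
print(f"PCIe x8 G1: {pcie_gen1:.0f} Gbit/s = {pcie_gen1 / 8:.1f} GB/s")
print(f"PCIe x8 G2: {pcie_gen2:.0f} Gbit/s = {pcie_gen2 / 8:.1f} GB/s")
```

Note that a 2.5 GT/s PCIe x8 slot tops out at the same 2 GB/s as a single
DDR port, which is why the second port buys nothing on that platform.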
Using both ports on 2.5GT PCIe will be pointless (you can't even saturate one
port), and on 5.0GT I'm guessing you'd max out at ~3000 MB/s unidirectional,
but I have not tried it myself.
=> Multirail for performance pretty much needs two HCAs.
/Peter
>
> I have several questions:
>
> 1. As far as I understand DDR IB has 16Gb/s data rate. Hence dual-rail
> has 32Gb/s or 4GB/s theoretical peak throughput. But osu_bw shows data
> rate higher than theoretical. How is that possible?
>
> 2. osu_bw is a unidirectional test and osu_bibw is a bidirectional one. So I
> suppose the latter should show twice the throughput, but it's almost the
> same as the unidirectional result.
>
> 3. RDMA put/get do not involve the target node in the operation and should
> be faster than ordinary send/recv. Why are they slower?
>
> Regards,
>
> Nikita
--
-= Peter Kjellström
-= National Supercomputer Centre