[mvapich-discuss] IB benchmarks that are a bit strange

Brian Budge brian.budge at gmail.com
Tue Mar 18 23:18:30 EDT 2008


Thanks DK.  The systems are Opteron NUMA.  Both run their processors at
2.4 GHz with DDR2-667 memory, and both use Mellanox InfiniHost III Lx
HCAs.  Both nodes are running Linux 2.6.24 with the InfiniBand drivers
built in.

It definitely has something to do with my head node.  I've been
running tests on 3 nodes to better characterize what I've been seeing,
and in my test I see close to 800 MB/s from slave to slave and about
600 MB/s from either slave to head or head to either slave.
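
As a rough sketch of why these three measurements point at the head node,
here is the reasoning in Python.  The node names are made up for
illustration, and the numbers are only the approximate peaks quoted above,
not exact osu_bw output:

```python
from collections import defaultdict

# Approximate peak bandwidth in MB/s for each node pair, taken from the
# figures quoted above. The node names are hypothetical placeholders.
measurements = {
    ("slave1", "slave2"): 800,
    ("head", "slave1"): 600,
    ("head", "slave2"): 600,
}


def mean_bandwidth_per_node(pairs):
    """Average the bandwidth over every pair a node takes part in."""
    seen = defaultdict(list)
    for (a, b), bw in pairs.items():
        seen[a].append(bw)
        seen[b].append(bw)
    return {node: sum(vals) / len(vals) for node, vals in seen.items()}


def slowest_node(pairs):
    """Return the node with the lowest average pairwise bandwidth."""
    means = mean_bandwidth_per_node(pairs)
    return min(means, key=means.get)


per_node = mean_bandwidth_per_node(measurements)
suspect = slowest_node(measurements)
print(per_node)
print("suspect:", suspect)
```

With only three nodes this is just bookkeeping, but the same averaging
should scale if I add more nodes to the test.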

If it's relevant, I'm also running IP over IB (IPoIB), and my head node
runs iptables.  Could this cause the slowdown?

Thanks,
  Brian

On Tue, Mar 18, 2008 at 7:36 PM, Dhabaleswar Panda
<panda at cse.ohio-state.edu> wrote:
> What kind of platforms are you using - Opteron NUMA or Intel? Are the two
>  systems homogeneous in terms of processor speed and memory speed? Do the
>  two systems have identical NICs - hw and firmware? Such information will
>  help to understand this problem better.
>
>  DK
>
>
>
>  On Sun, 16 Mar 2008, Brian Budge wrote:
>
>  > Hi all -
>  >
>  > I am running the osu_bw bandwidth test on my small cluster.  I'm
>  > seeing some slightly strange behavior:  Let's say I have nodes 0 and
>  > 1.  When I launch the bw test from node 0 and run the test on 0 and 1,
>  > I see a max bandwidth of 650 MB/s.  However, when I run from node 1
>  > and run the test on 0 and 1, I see a max bandwidth of close to 850
>  > MB/s.  Does anyone know how I might diagnose/fix this issue?  Has
>  > anyone seen it before?
>  >
>  > Thanks,
>  >   Brian
>  > _______________________________________________
>  > mvapich-discuss mailing list
>  > mvapich-discuss at cse.ohio-state.edu
>  > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>  >
>
>