[mvapich-discuss] Announcing the release of MVAPICH2 0.9.5 with
SRQ, integrated multi-rail and TotalView support
Abhinav Vishnu
vishnu at cse.ohio-state.edu
Thu Aug 31 12:39:40 EDT 2006
Hi Eric,
Thanks for trying out MVAPICH2 0.9.5 and reporting the issue to us.
Also, we are glad to know that you are seeing excellent performance with MVAPICH
using the multi-rail device.
For MVAPICH2, the multi-rail integration has been done for the OpenIB/Gen2
device. May I suggest that you try MVAPICH2 with this device and
let us know if you still see performance issues.
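For example, a rebuild along the following lines should pick up the Gen2
device. Please note that the install prefix below is only illustrative, and
the exact name of the Gen2 RDMA configure option is my assumption here, so
please double-check it against the MVAPICH2 0.9.5 user guide:

  # reconfigure for the OpenIB/Gen2 device instead of VAPI
  # (prefix is illustrative; verify the --with-rdma=gen2 option in the user guide)
  ./configure --prefix=/share/apps/mvapich2-gen2 \
              --with-device=osu_ch3:mrail --with-rdma=gen2 \
              --with-pm=mpd --enable-sharedlibs=gcc
  make && make install

  # then rerun the same IMB test over both ports of the HCA,
  # reusing the environment variables from your earlier invocation
  mpiexec -n 2 -env NUM_HCAS 1 -env NUM_PORTS 2 ./IMB-MPI1 pingpong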
Thanks again for trying MVAPICH2. Please keep us posted.
Regards,
- Abhinav
-------------------------------
Abhinav Vishnu,
Graduate Research Associate,
Department Of Comp. Sc. & Engg.
The Ohio State University.
-------------------------------
On Thu, 31 Aug 2006, Eric A. Borisch wrote:
> Morning all,
>
> I've installed MVAPICH2 0.9.5 on my system (nodes are dual P4 3.4 GHz, PCI-E
> SDR 2-port cards).
>
> I can't seem to get the multi-rail functionality working. It works, however,
> with the latest MVAPICH distribution on this system.
>
> Here are the mpichversion/mpich2version outputs for the two builds:
>
> MVAPICH2:
>
> Version: 1.0.3
> Device: osu_ch3:mrail
> Configure Options: --prefix=/share/apps/mvapich2 --with-device=osu_ch3:mrail
> --with-rdma=vapi --with-pm=mpd --disable-romio --enable-sharedlibs=gcc
> --with-mpe
>
>
> MVAPICH (Multi-rail):
>
> MPICH Version: 1.2.7
> MPICH Release date: $Date: 2005/06/22 16:33:49$
> MPICH Patches applied: none
> MPICH configure: --with-device=vapi_multirail --with-arch=LINUX
> -prefix=/share/apps/mvapich_mr --without-romio --with-mpe --enable-sharedlib
> -lib=-L/usr/lib64 -lmtl_common -lvapi -lmosal -lmpga -lpthread
> MPICH Device: vapi_multirail
>
>
>
> Here are the IMB ping-pong invocations and results:
>
> MVAPICH2:
>
> > [eborisch at rt2 IMB-MPI]$ mpiexec -n 2 -env NUM_HCAS 1 -env NUM_PORTS 2
> > ./IMB-MPI1_mvapich2_sharedlib pingpong
> > #---------------------------------------------------
> > # Intel (R) MPI Benchmark Suite V2.3, MPI-1 part
> > #---------------------------------------------------
> > # Date : Thu Aug 31 09:24:47 2006
> > # Machine : x86_64
> > # System : Linux
> > # Release : 2.6.9-22.ELsmp
> > # Version : #1 SMP Sat Oct 8 21:32:36 BST 2005
> >
> > #
> > # Minimum message length in bytes: 0
> > # Maximum message length in bytes: 4194304
> > #
> > # MPI_Datatype : MPI_BYTE
> > # MPI_Datatype for reductions : MPI_FLOAT
> > # MPI_Op : MPI_SUM
> > #
> > #
> >
> > # List of Benchmarks to run:
> >
> > # PingPong
> >
> > #---------------------------------------------------
> > # Benchmarking PingPong
> > # #processes = 2
> > #---------------------------------------------------
> > #bytes #repetitions t[usec] Mbytes/sec
> > 0 1000 3.99 0.00
> > 1 1000 4.21 0.23
> > 2 1000 4.09 0.47
> > 4 1000 4.15 0.92
> > 8 1000 4.21 1.81
> > 16 1000 4.26 3.59
> > 32 1000 4.31 7.09
> > 64 1000 4.49 13.60
> > 128 1000 4.70 25.95
> > 256 1000 5.24 46.62
> > 512 1000 6.57 74.36
> > 1024 1000 8.01 121.86
> > 2048 1000 9.50 205.52
> > 4096 1000 12.69 307.80
> > 8192 1000 24.90 313.75
> > 16384 1000 33.25 469.99
> > 32768 1000 50.26 621.73
> > 65536 640 84.34 741.05
> > 131072 320 152.60 819.16
> > 262144 160 289.14 864.64
> > 524288 80 561.59 890.33
> > 1048576 40 1105.97 904.18
> > 2097152 20 2199.28 909.39
> > 4194304 10 4377.79 913.70
> >
>
> MVAPICH (Multi-Rail):
>
> > /share/apps/mvapich_mr/bin/mpirun -np 2 -machinefile machfile
> > IMB-MPI1_mvapich_mr pingpong
> > #---------------------------------------------------
> > # Intel (R) MPI Benchmark Suite V2.3, MPI-1 part
> > #---------------------------------------------------
> > # Date : Thu Aug 31 09:25:13 2006
> > # Machine : x86_64
> > # System : Linux
> > # Release : 2.6.9-22.ELsmp
> > # Version : #1 SMP Sat Oct 8 21:32:36 BST 2005
> >
> > #
> > # Minimum message length in bytes: 0
> > # Maximum message length in bytes: 4194304
> > #
> > # MPI_Datatype : MPI_BYTE
> > # MPI_Datatype for reductions : MPI_FLOAT
> > # MPI_Op : MPI_SUM
> > #
> > #
> >
> > # List of Benchmarks to run:
> >
> > # PingPong
> >
> > #---------------------------------------------------
> > # Benchmarking PingPong
> > # #processes = 2
> > #---------------------------------------------------
> > #bytes #repetitions t[usec] Mbytes/sec
> > 0 1000 4.36 0.00
> > 1 1000 4.56 0.21
> > 2 1000 4.48 0.43
> > 4 1000 4.33 0.88
> > 8 1000 4.77 1.60
> > 16 1000 4.74 3.22
> > 32 1000 4.50 6.79
> > 64 1000 4.97 12.28
> > 128 1000 4.82 25.30
> > 256 1000 5.28 46.27
> > 512 1000 6.65 73.47
> > 1024 1000 7.93 123.18
> > 2048 1000 9.55 204.58
> > 4096 1000 12.99 300.76
> > 8192 1000 28.35 275.55
> > 16384 1000 33.36 468.32
> > 32768 1000 44.43 703.39
> > 65536 640 66.83 935.16
> > 131072 320 111.99 1116.20
> > 262144 160 200.89 1244.44
> > 524288 80 378.64 1320.52
> > 1048576 40 736.00 1358.70
> > 2097152 20 1448.02 1381.19
> > 4194304 10 2897.55 1380.48
> >
>
> Thanks in advance for any suggestions!
> Eric Borisch
> Mayo Clinic - Radiology Research
>
>
> On 8/30/06, Dhabaleswar Panda <panda at cse.ohio-state.edu> wrote:
> > The MVAPICH team is pleased to announce the availability of MVAPICH2
> > 0.9.5 with the following NEW features:
> >
> > - Shared Receive Queue (SRQ) and Adaptive RDMA support: These
> > features reduce memory usage of the MPI library significantly to
> > provide scalability without any degradation in performance.
> >
> > Performance of applications and memory scalability using SRQ
> > and Adaptive RDMA support can be seen by visiting the following
> > URL:
> >
> > http://nowlab.cse.ohio-state.edu/projects/mpi-iba/perf-apps.html
> >
> > - Integrated multi-rail communication support for both two-sided and
> > one-sided operations
> > - Multiple queue pairs per port
> > - Multiple ports per adapter
> > - Multiple adapters
> >
> > - Support for TotalView debugger
> >
> > - Auto-detection of Architecture and InfiniBand adapters
> >
> > More details on all features and supported platforms can be obtained
> > by visiting the following URL:
> >
> > http://nowlab.cse.ohio-state.edu/projects/mpi-iba/mvapich2_features.html
> >
> > MVAPICH2 0.9.5 continues to deliver excellent performance. Sample
> > performance numbers include:
> >
> > - OpenIB/Gen2 on EM64T with PCI-Ex and IBA-DDR:
> > Two-sided operations:
> > - 2.97 microsec one-way latency (4 bytes)
> > - 1478 MB/sec unidirectional bandwidth
> > - 2658 MB/sec bidirectional bandwidth
> >
> > One-sided operations:
> > - 5.08 microsec Put latency
> > - 1484 MB/sec unidirectional Put bandwidth
> > - 2658 MB/sec bidirectional Put bandwidth
> >
> > - OpenIB/Gen2 on EM64T with PCI-Ex and IBA-DDR (Dual-rail):
> > Two-sided operations:
> > - 3.01 microsec one-way latency (4 bytes)
> > - 2346 MB/sec unidirectional bandwidth
> > - 2779 MB/sec bidirectional bandwidth
> >
> > One-sided operations:
> > - 4.70 microsec Put latency
> > - 2389 MB/sec unidirectional Put bandwidth
> > - 2779 MB/sec bidirectional Put bandwidth
> >
> > - OpenIB/Gen2 on Opteron with PCI-Ex and IBA-DDR:
> > Two-sided operations:
> > - 2.71 microsec one-way latency (4 bytes)
> > - 1411 MB/sec unidirectional bandwidth
> > - 2238 MB/sec bidirectional bandwidth
> >
> > One-sided operations:
> > - 4.28 microsec Put latency
> > - 1411 MB/sec unidirectional Put bandwidth
> > - 2238 MB/sec bidirectional Put bandwidth
> >
> > - Solaris uDAPL/IBTL on Opteron with PCI-Ex and IBA-SDR:
> > Two-sided operations:
> > - 4.81 microsec one-way latency (4 bytes)
> > - 981 MB/sec unidirectional bandwidth
> > - 1903 MB/sec bidirectional bandwidth
> >
> > One-sided operations:
> > - 7.49 microsec Put latency
> > - 981 MB/sec unidirectional Put bandwidth
> > - 1903 MB/sec bidirectional Put bandwidth
> >
> > - OpenIB/Gen2 uDAPL on EM64T with PCI-Ex and IBA-SDR:
> > Two-sided operations:
> > - 3.56 microsec one-way latency (4 bytes)
> > - 964 MB/sec unidirectional bandwidth
> > - 1846 MB/sec bidirectional bandwidth
> >
> > One-sided operations:
> > - 6.85 microsec Put latency
> > - 964 MB/sec unidirectional Put bandwidth
> > - 1846 MB/sec bidirectional Put bandwidth
> >
> > - OpenIB/Gen2 uDAPL on EM64T with PCI-Ex and IBA-DDR:
> > Two-sided operations:
> > - 3.18 microsec one-way latency (4 bytes)
> > - 1484 MB/sec unidirectional bandwidth
> > - 2635 MB/sec bidirectional bandwidth
> >
> > One-sided operations:
> > - 5.41 microsec Put latency
> > - 1485 MB/sec unidirectional Put bandwidth
> > - 2635 MB/sec bidirectional Put bandwidth
> >
> > Performance numbers for all other platforms, system configurations and
> > operations can be viewed by visiting `Performance' section of the
> > project's web page.
> >
> > With the ADI-3-level design, MVAPICH2 0.9.5 delivers similar
> > performance for two-sided operations compared to MVAPICH 0.9.8.
> > Performance comparison between MVAPICH2 0.9.5 and MVAPICH 0.9.8 for
> > sample applications can be seen by visiting the following URL:
> >
> > http://nowlab.cse.ohio-state.edu/projects/mpi-iba/perf-apps.html
> >
> > Organizations and users interested in getting the best performance for
> > both two-sided and one-sided operations, and who also want to exploit
> > `multi-threading' and `integrated multi-rail' capabilities, may migrate
> > from the MVAPICH code base to the MVAPICH2 code base.
> >
> > To download the MVAPICH2 0.9.5 package and access the anonymous
> > SVN, please visit the following URL:
> >
> > http://nowlab.cse.ohio-state.edu/projects/mpi-iba/
> >
> > A stripped-down version of this release is also available in the
> > OpenIB SVN.
> >
> > All feedback, including bug reports and hints for performance tuning,
> > is welcome. Please post it to the mvapich-discuss mailing list.
> >
> > Thanks,
> >
> > MVAPICH Team at OSU/NBCL
> >
> > ======================================================================
> > The MVAPICH/MVAPICH2 project is currently supported with funding from
> > the U.S. National Science Foundation, U.S. DOE Office of Science,
> > Mellanox, Intel, Cisco Systems, Sun Microsystems and Linux Networx;
> > and with equipment support from Advanced Clustering, AMD, Apple,
> > Appro, Dell, IBM, Intel, Mellanox, Microway, PathScale, SilverStorm
> > and Sun Microsystems. Another technology partner is Etnus.
> > ======================================================================
> >
> > _______________________________________________
> > mvapich-discuss mailing list
> > mvapich-discuss at mail.cse.ohio-state.edu
> > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> >
>
>
>
> --
> Eric A. Borisch
> eborisch at ieee.org
>