[mvapich-discuss] Announcing the release of MVAPICH2 0.9.5 with
SRQ, integrated multi-rail and TotalView support
Eric A. Borisch
eborisch at ieee.org
Thu Aug 31 10:37:53 EDT 2006
Morning all,
I've installed MVAPICH2 0.9.5 on my system (the nodes are dual 3.4 GHz
P4s with two-port PCI-E SDR cards), but I can't seem to get the
multi-rail functionality working: large-message bandwidth plateaus
around 914 MB/s, consistent with a single SDR rail. Multi-rail does
work on this system with the latest MVAPICH distribution, which
reaches roughly 1380 MB/s on the same test.
Here are the mpich2version / mpichversion outputs for the two builds:
MVAPICH2:
Version: 1.0.3
Device: osu_ch3:mrail
Configure Options: --prefix=/share/apps/mvapich2 --with-device=osu_ch3:mrail
--with-rdma=vapi --with-pm=mpd --disable-romio --enable-sharedlibs=gcc
--with-mpe
MVAPICH (Multi-rail):
MPICH Version: 1.2.7
MPICH Release date: $Date: 2005/06/22 16:33:49$
MPICH Patches applied: none
MPICH configure: --with-device=vapi_multirail --with-arch=LINUX
-prefix=/share/apps/mvapich_mr --without-romio --with-mpe --enable-sharedlib
-lib=-L/usr/lib64 -lmtl_common -lvapi -lmosal -lmpga -lpthread
MPICH Device: vapi_multirail
Here are the IMB ping-pong invocations and results:
MVAPICH2:
> [eborisch at rt2 IMB-MPI]$ mpiexec -n 2 -env NUM_HCAS 1 -env NUM_PORTS 2
> ./IMB-MPI1_mvapich2_sharedlib pingpong
> #---------------------------------------------------
> # Intel (R) MPI Benchmark Suite V2.3, MPI-1 part
> #---------------------------------------------------
> # Date : Thu Aug 31 09:24:47 2006
> # Machine : x86_64
> # System : Linux
> # Release : 2.6.9-22.ELsmp
> # Version : #1 SMP Sat Oct 8 21:32:36 BST 2005
>
> #
> # Minimum message length in bytes: 0
> # Maximum message length in bytes: 4194304
> #
> # MPI_Datatype : MPI_BYTE
> # MPI_Datatype for reductions : MPI_FLOAT
> # MPI_Op : MPI_SUM
> #
> #
>
> # List of Benchmarks to run:
>
> # PingPong
>
> #---------------------------------------------------
> # Benchmarking PingPong
> # #processes = 2
> #---------------------------------------------------
>     #bytes #repetitions      t[usec]   Mbytes/sec
>          0         1000         3.99         0.00
>          1         1000         4.21         0.23
>          2         1000         4.09         0.47
>          4         1000         4.15         0.92
>          8         1000         4.21         1.81
>         16         1000         4.26         3.59
>         32         1000         4.31         7.09
>         64         1000         4.49        13.60
>        128         1000         4.70        25.95
>        256         1000         5.24        46.62
>        512         1000         6.57        74.36
>       1024         1000         8.01       121.86
>       2048         1000         9.50       205.52
>       4096         1000        12.69       307.80
>       8192         1000        24.90       313.75
>      16384         1000        33.25       469.99
>      32768         1000        50.26       621.73
>      65536          640        84.34       741.05
>     131072          320       152.60       819.16
>     262144          160       289.14       864.64
>     524288           80       561.59       890.33
>    1048576           40      1105.97       904.18
>    2097152           20      2199.28       909.39
>    4194304           10      4377.79       913.70
>
>
MVAPICH (Multi-Rail):
> /share/apps/mvapich_mr/bin/mpirun -np 2 -machinefile machfile
> IMB-MPI1_mvapich_mr pingpong
> #---------------------------------------------------
> # Intel (R) MPI Benchmark Suite V2.3, MPI-1 part
> #---------------------------------------------------
> # Date : Thu Aug 31 09:25:13 2006
> # Machine : x86_64
> # System : Linux
> # Release : 2.6.9-22.ELsmp
> # Version : #1 SMP Sat Oct 8 21:32:36 BST 2005
>
> #
> # Minimum message length in bytes: 0
> # Maximum message length in bytes: 4194304
> #
> # MPI_Datatype : MPI_BYTE
> # MPI_Datatype for reductions : MPI_FLOAT
> # MPI_Op : MPI_SUM
> #
> #
>
> # List of Benchmarks to run:
>
> # PingPong
>
> #---------------------------------------------------
> # Benchmarking PingPong
> # #processes = 2
> #---------------------------------------------------
>     #bytes #repetitions      t[usec]   Mbytes/sec
>          0         1000         4.36         0.00
>          1         1000         4.56         0.21
>          2         1000         4.48         0.43
>          4         1000         4.33         0.88
>          8         1000         4.77         1.60
>         16         1000         4.74         3.22
>         32         1000         4.50         6.79
>         64         1000         4.97        12.28
>        128         1000         4.82        25.30
>        256         1000         5.28        46.27
>        512         1000         6.65        73.47
>       1024         1000         7.93       123.18
>       2048         1000         9.55       204.58
>       4096         1000        12.99       300.76
>       8192         1000        28.35       275.55
>      16384         1000        33.36       468.32
>      32768         1000        44.43       703.39
>      65536          640        66.83       935.16
>     131072          320       111.99      1116.20
>     262144          160       200.89      1244.44
>     524288           80       378.64      1320.52
>    1048576           40       736.00      1358.70
>    2097152           20      1448.02      1381.19
>    4194304           10      2897.55      1380.48
>
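For reference, the core of what the PingPong test times is just the
following (my own minimal sketch, not the IMB source; IMB sweeps the
message sizes and repetition counts shown in the tables above):

/* pingpong.c: rank 0 sends, rank 1 echoes back; the reported t[usec]
 * is half the round-trip time. IMB's Mbytes/sec uses 2^20 bytes. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, i, reps = 1000, bytes = 4194304;  /* one size for brevity */
    char *buf;
    double t0, t1, half_rtt;
    MPI_Status st;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    buf = malloc(bytes);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < reps; i++) {
        if (rank == 0) {
            MPI_Send(buf, bytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, bytes, MPI_BYTE, 1, 0, MPI_COMM_WORLD, &st);
        } else if (rank == 1) {
            MPI_Recv(buf, bytes, MPI_BYTE, 0, 0, MPI_COMM_WORLD, &st);
            MPI_Send(buf, bytes, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    half_rtt = (t1 - t0) / reps / 2.0;          /* seconds */
    if (rank == 0)
        printf("%d bytes: %.2f usec  %.2f Mbytes/sec\n", bytes,
               half_rtt * 1e6, bytes / half_rtt / (1024.0 * 1024.0));

    free(buf);
    MPI_Finalize();
    return 0;
}

Building this with each tree's mpicc and launching it with the same
mpiexec/mpirun lines as above is a quick way to cross-check the IMB
numbers.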
Thanks in advance for any suggestions!
Eric Borisch
Mayo Clinic - Radiology Research
On 8/30/06, Dhabaleswar Panda <panda at cse.ohio-state.edu> wrote:
> The MVAPICH team is pleased to announce the availability of MVAPICH2
> 0.9.5 with the following NEW features:
>
> - Shared Receive Queue (SRQ) and Adaptive RDMA support: these
> features significantly reduce the memory usage of the MPI library,
> providing scalability without any degradation in performance (a
> verbs-level sketch of the SRQ idea follows this feature list).
>
> Application performance and memory scalability using SRQ and
> Adaptive RDMA support can be seen by visiting the following
> URL:
>
> http://nowlab.cse.ohio-state.edu/projects/mpi-iba/perf-apps.html
>
> - Integrated multi-rail communication support for both two-sided and
> one-sided operations (a striping sketch also follows this list)
> - Multiple queue pairs per port
> - Multiple ports per adapter
> - Multiple adapters
>
> - Support for TotalView debugger
>
> - Auto-detection of Architecture and InfiniBand adapters
>
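> At the verbs level, the SRQ idea is that all connections draw receive
> buffers from one shared pool instead of pre-posting buffers on every
> connection. A simplified OpenIB/Gen2 libibverbs sketch (illustrative
> only, not the MVAPICH2 source; queue depths are placeholders):
>
>   /* One SRQ per process, with every QP attached to it. */
>   #include <infiniband/verbs.h>
>
>   struct ibv_srq *make_shared_rq(struct ibv_pd *pd)
>   {
>       struct ibv_srq_init_attr sattr = {
>           .attr = { .max_wr  = 512, /* receive WRs shared by ALL peers */
>                     .max_sge = 1 },
>       };
>       return ibv_create_srq(pd, &sattr);
>   }
>
>   struct ibv_qp *make_qp_on_srq(struct ibv_pd *pd, struct ibv_cq *cq,
>                                 struct ibv_srq *srq)
>   {
>       struct ibv_qp_init_attr qattr = {
>           .send_cq = cq,
>           .recv_cq = cq,
>           .srq     = srq,          /* receives are taken from the SRQ */
>           .cap     = { .max_send_wr = 64, .max_send_sge = 1 },
>           .qp_type = IBV_QPT_RC,
>       };
>       return ibv_create_qp(pd, &qattr);
>   }
>
> Receive-buffer memory then scales with the SRQ depth rather than with
> the number of peer processes.
>
> Likewise, striping one large message across two rails amounts to
> posting one half of the buffer on each connection. Another sketch
> (the rail[] QPs, remote addresses and rkeys are placeholders; assumes
> <string.h> and <stdint.h> in addition to the header above):
>
>   /* RDMA-write one registered buffer as two halves over two rails. */
>   static void post_striped(struct ibv_qp *rail[2], struct ibv_mr *mr,
>                            char *buf, size_t len,
>                            uint64_t raddr[2], uint32_t rkey[2])
>   {
>       struct ibv_send_wr wr, *bad;
>       struct ibv_sge sge;
>       size_t half = len / 2;
>       int r;
>
>       for (r = 0; r < 2; r++) {
>           memset(&wr, 0, sizeof wr);
>           sge.addr   = (uintptr_t)(buf + r * half);
>           sge.length = (uint32_t)half;
>           sge.lkey   = mr->lkey;
>           wr.sg_list             = &sge;
>           wr.num_sge             = 1;
>           wr.opcode              = IBV_WR_RDMA_WRITE;
>           wr.send_flags          = IBV_SEND_SIGNALED;
>           wr.wr.rdma.remote_addr = raddr[r];
>           wr.wr.rdma.rkey        = rkey[r];
>           ibv_post_send(rail[r], &wr, &bad);
>       }
>   }
>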
> More details on all features and supported platforms can be obtained
> by visiting the following URL:
>
> http://nowlab.cse.ohio-state.edu/projects/mpi-iba/mvapich2_features.html
>
> MVAPICH2 0.9.5 continues to deliver excellent performance. Sample
> performance numbers include the following (a sketch of the measured
> one-sided Put pattern appears after the list):
>
> - OpenIB/Gen2 on EM64T with PCI-Ex and IBA-DDR:
> Two-sided operations:
> - 2.97 microsec one-way latency (4 bytes)
> - 1478 MB/sec unidirectional bandwidth
> - 2658 MB/sec bidirectional bandwidth
>
> One-sided operations:
> - 5.08 microsec Put latency
> - 1484 MB/sec unidirectional Put bandwidth
> - 2658 MB/sec bidirectional Put bandwidth
>
> - OpenIB/Gen2 on EM64T with PCI-Ex and IBA-DDR (Dual-rail):
> Two-sided operations:
> - 3.01 microsec one-way latency (4 bytes)
> - 2346 MB/sec unidirectional bandwidth
> - 2779 MB/sec bidirectional bandwidth
>
> One-sided operations:
> - 4.70 microsec Put latency
> - 2389 MB/sec unidirectional Put bandwidth
> - 2779 MB/sec bidirectional Put bandwidth
>
> - OpenIB/Gen2 on Opteron with PCI-Ex and IBA-DDR:
> Two-sided operations:
> - 2.71 microsec one-way latency (4 bytes)
> - 1411 MB/sec unidirectional bandwidth
> - 2238 MB/sec bidirectional bandwidth
>
> One-sided operations:
> - 4.28 microsec Put latency
> - 1411 MB/sec unidirectional Put bandwidth
> - 2238 MB/sec bidirectional Put bandwidth
>
> - Solaris uDAPL/IBTL on Opteron with PCI-Ex and IBA-SDR:
> Two-sided operations:
> - 4.81 microsec one-way latency (4 bytes)
> - 981 MB/sec unidirectional bandwidth
> - 1903 MB/sec bidirectional bandwidth
>
> One-sided operations:
> - 7.49 microsec Put latency
> - 981 MB/sec unidirectional Put bandwidth
> - 1903 MB/sec bidirectional Put bandwidth
>
> - OpenIB/Gen2 uDAPL on EM64T with PCI-Ex and IBA-SDR:
> Two-sided operations:
> - 3.56 microsec one-way latency (4 bytes)
> - 964 MB/sec unidirectional bandwidth
> - 1846 MB/sec bidirectional bandwidth
>
> One-sided operations:
> - 6.85 microsec Put latency
> - 964 MB/sec unidirectional Put bandwidth
> - 1846 MB/sec bidirectional Put bandwidth
>
> - OpenIB/Gen2 uDAPL on EM64T with PCI-Ex and IBA-DDR:
> Two-sided operations:
> - 3.18 microsec one-way latency (4 bytes)
> - 1484 MB/sec unidirectional bandwidth
> - 2635 MB/sec bidirectional bandwidth
>
> One-sided operations:
> - 5.41 microsec Put latency
> - 1485 MB/sec unidirectional Put bandwidth
> - 2635 MB/sec bidirectional Put bandwidth
>
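> The one-sided numbers above time MPI-2 RMA operations. The Put
> pattern being measured is essentially the following (a minimal sketch
> with fence synchronization, not the actual benchmark source):
>
>   #include <mpi.h>
>   #include <stdlib.h>
>
>   /* Sketch: rank 0 writes `bytes' of data directly into rank 1's
>    * exposed window; the fences open and close the RMA epoch. */
>   void put_once(int rank, int bytes)
>   {
>       char *sbuf = malloc(bytes), *rbuf = malloc(bytes);
>       MPI_Win win;
>
>       MPI_Win_create(rbuf, bytes, 1, MPI_INFO_NULL,
>                      MPI_COMM_WORLD, &win);
>       MPI_Win_fence(0, win);
>       if (rank == 0)
>           MPI_Put(sbuf, bytes, MPI_BYTE, 1, 0, bytes, MPI_BYTE, win);
>       MPI_Win_fence(0, win);  /* completes the Put at origin and target */
>       MPI_Win_free(&win);
>       free(sbuf); free(rbuf);
>   }
>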
> Performance numbers for all other platforms, system configurations and
> operations can be viewed by visiting the `Performance' section of the
> project's web page.
>
> With the ADI-3-level design, MVAPICH2 0.9.5 delivers performance for
> two-sided operations similar to that of MVAPICH 0.9.8. A performance
> comparison between MVAPICH2 0.9.5 and MVAPICH 0.9.8 for sample
> applications can be seen by visiting the following URL:
>
> http://nowlab.cse.ohio-state.edu/projects/mpi-iba/perf-apps.html
>
> Organizations and users who want the best performance for both
> two-sided and one-sided operations, and who also want to exploit the
> `multi-threading' and `integrated multi-rail' capabilities, may
> migrate from the MVAPICH code base to the MVAPICH2 code base.
>
> To download the MVAPICH2 0.9.5 package and access the anonymous
> SVN, please visit the following URL:
>
> http://nowlab.cse.ohio-state.edu/projects/mpi-iba/
>
> A stripped-down version of this release is also available from the
> OpenIB SVN.
>
> All feedback, including bug reports and hints for performance tuning,
> is welcome. Please post to the mvapich-discuss mailing list.
>
> Thanks,
>
> MVAPICH Team at OSU/NBCL
>
> ======================================================================
> MVAPICH/MVAPICH2 project is currently supported with funding from
> U.S. National Science Foundation, U.S. DOE Office of Science,
> Mellanox, Intel, Cisco Systems, Sun Microsystems and Linux Networx;
> and with equipment support from Advanced Clustering, AMD, Apple,
> Appro, Dell, IBM, Intel, Mellanox, Microway, PathScale, SilverStorm
> and Sun Microsystems. Other technology partners include Etnus.
> ======================================================================
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at mail.cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
--
Eric A. Borisch
eborisch at ieee.org