[mvapich-discuss] Announcing the Release of MVAPICH2 1.4.1
Dhabaleswar Panda
panda at cse.ohio-state.edu
Fri Mar 12 22:44:11 EST 2010
The MVAPICH team is pleased to announce the release of MVAPICH2 1.4.1
with the following enhancements and bug fixes:
* Enhancements since mvapich2-1.4
- MPMD launch capability to mpirun_rsh
- Portable Hardware Locality (hwloc) support
- Patch suggested by Dr. Bernd Kallies <kallies at zib.de>
- Multi-port support for iWARP
- Enhanced iWARP design for scalability to higher process count
- Ring based startup support for RDMAoE
* Bug fixes since mvapich2-1.4
- Fixes for MPE and other profiling tools
- As suggested by Anthony Chan (chan at mcs.anl.gov)
- Fixes for finalization issue with dynamic process management
- Removed overrides to PSM_SHAREDCONTEXT, PSM_SHAREDCONTEXTS_MAX
variables.
- Suggested by Ben Truscott <b.s.truscott at bristol.ac.uk>.
- Fixing the error check for buffer aliasing in MPI_Reduce
- Suggested by Dr. Rajeev Thakur <thakur at mcs.anl.gov>
- Fix Totalview integration for RHEL5
- Update simplemake to handle build timestamp issues
- Fixes for --enable-g={mem, meminit}
- Improved logic to control the receive and send requests to handle
the limitation of CQ Depth on iWARP
- Fixing assertion failures with IMB-EXT tests
- VBUF size for very small iWARP clusters bumped up to 33K
- Replace internal mallocs with MPIU_Malloc uniformly for correct
tracing with --enable-g=mem
- Fixing multi-port for iWARP
- Fix memory leaks
- Shared-memory reduce fixes for MPI_Reduce invoked with
MPI_IN_PLACE
- Handling RDMA_CM_EVENT_TIMEWAIT_EXIT event
- Fix for threaded-ctxdup mpich2 test
- Detecting spawn errors
- Patch contributed by Dr. Bernd Kallies <kallies at zib.de>
- IMB-EXT fixes reported by Yutaka from Cray Japan
- Fix alltoall assertion error when LiMIC2 is used
MVAPICH2 1.4.1 is being made available with OFED 1.5.1. It continues
to deliver excellent performance. Sample performance numbers include:
OpenFabrics/Gen2 on Nehalem quad-core (2.4 GHz) with PCIe-Gen2
and ConnectX-QDR (Two-sided Operations):
- 1.57 microsec one-way latency (4 bytes)
- 3026 MB/sec unidirectional bandwidth
- 5858 MB/sec bidirectional bandwidth
QLogic InfiniPath Support on Nehalem quad-core (2.4 GHz) with
PCIe-Gen2 and QLogic-DDR (Two-sided Operations):
- 2.23 microsec one-way latency (4 bytes)
- 1883 MB/sec unidirectional bandwidth
- 3286 MB/sec bidirectional bandwidth
OpenFabrics/Gen2-RDMAoE (RDMA over Ethernet) Support on
Nehalem quad-core (2.4 GHz) with ConnectX-EN
(Two-sided operations):
- 3.29 microsec one-way latency (4 bytes)
- 1143 MB/sec unidirectional bandwidth
- 2283 MB/sec bidirectional bandwidth
Intra-node performance on Nehalem quad-core (2.4 GHz)
(Two-sided operations, intra-socket):
- 0.35 microsec one-way latency (4 bytes)
- 9154 MB/sec unidirectional bandwidth, with and without LiMIC2
- 11787 MB/sec bidirectional bandwidth with LiMIC2
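As a reading aid for the bandwidth figures above, the following is a
minimal sketch that computes how close each configuration's bidirectional
bandwidth comes to twice its unidirectional bandwidth. The numbers are
copied from this announcement; the dictionary labels and the function
names (bidirectional_ratio, summarize) are illustrative assumptions and
are not part of MVAPICH2 or its benchmarks.

```python
# (unidirectional, bidirectional) bandwidth in MB/sec, copied from the
# figures listed above. The keys are illustrative labels only.
BANDWIDTH_MB_PER_SEC = {
    "OpenFabrics/Gen2, ConnectX-QDR": (3026, 5858),
    "QLogic InfiniPath, QLogic-DDR": (1883, 3286),
    "OpenFabrics/Gen2-RDMAoE, ConnectX-EN": (1143, 2283),
    "Intra-node, intra-socket, LiMIC2": (9154, 11787),
}


def bidirectional_ratio(unidirectional, bidirectional):
    """Return bidirectional bandwidth as a multiple of unidirectional."""
    return bidirectional / unidirectional


def summarize(table):
    """Map each configuration label to its ratio, rounded to 2 places."""
    return {
        name: round(bidirectional_ratio(uni, bidi), 2)
        for name, (uni, bidi) in table.items()
    }


if __name__ == "__main__":
    for name, ratio in summarize(BANDWIDTH_MB_PER_SEC).items():
        print(f"{name}: {ratio:.2f}x unidirectional")
```

A ratio near 2 suggests that traffic in both directions proceeds with
little mutual interference; a lower ratio suggests a shared resource is
limiting the combined rate. This is only an interpretation aid and says
nothing about how the numbers were measured.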
Performance numbers for several other platforms, system configurations
and operations (such as collectives) can be viewed by visiting the
`Performance' section of the project's web page.
To download MVAPICH2 1.4.1 and the associated user guide, or to access
the SVN repository, please visit the following URL:
http://mvapich.cse.ohio-state.edu
All questions, feedback, bug reports, hints for performance tuning,
patches and enhancements are welcome. Please post them to the
mvapich-discuss mailing list (mvapich-discuss at cse.ohio-state.edu).
Thanks,
The MVAPICH Team