[mvapich-discuss] Announcing the Release of MVAPICH2 1.4.1

Dhabaleswar Panda panda at cse.ohio-state.edu
Fri Mar 12 22:44:11 EST 2010


The MVAPICH team is pleased to announce the release of MVAPICH2 1.4.1
with the following enhancements and bug fixes:

* Enhancements since mvapich2-1.4
   - MPMD launch capability to mpirun_rsh
   - Portable Hardware Locality (hwloc) support
        - Patch suggested by Dr. Bernd Kallies <kallies at zib.de>
   - Multi-port support for iWARP
   - Enhanced iWARP design for scalability to higher process count
   - Ring based startup support for RDMAoE

* Bug fixes since mvapich2-1.4
   - Fixes for MPE and other profiling tools
        - As suggested by Anthony Chan (chan at mcs.anl.gov)
   - Fixes for finalization issue with dynamic process management
   - Removed overrides to the PSM_SHAREDCONTEXT and
     PSM_SHAREDCONTEXTS_MAX variables
        - Suggested by Ben Truscott <b.s.truscott at bristol.ac.uk>
   - Fixing the error check for buffer aliasing in MPI_Reduce
        - Suggested by Dr. Rajeev Thakur <thakur at mcs.anl.gov>
   - Fix Totalview integration for RHEL5
   - Update simplemake to handle build timestamp issues
   - Fixes for --enable-g={mem, meminit}
   - Improved logic to control the receive and send requests to handle
     the limitation of CQ Depth on iWARP
   - Fixing assertion failures with IMB-EXT tests
   - VBUF size for very small iWARP clusters bumped up to 33K
   - Replace internal mallocs with MPIU_Malloc uniformly for correct
     tracing with --enable-g=mem
   - Fixing multi-port for iWARP
   - Fix memory leaks
   - Shared-memory reduce fixes for MPI_Reduce invoked with
     MPI_IN_PLACE
   - Handling RDMA_CM_EVENT_TIMEWAIT_EXIT event
   - Fix for threaded-ctxdup mpich2 test
   - Detecting spawn errors
        - Patch contributed by Dr. Bernd Kallies <kallies at zib.de>
   - IMB-EXT fixes reported by Yutaka from Cray Japan
   - Fix alltoall assertion error when LiMIC2 is used

MVAPICH2 1.4.1 is being made available with OFED 1.5.1. It continues
to deliver excellent performance. Sample performance numbers include:

  OpenFabrics/Gen2 on Nehalem quad-core (2.4 GHz) with PCIe-Gen2
      and ConnectX-QDR (Two-sided Operations):
        - 1.57 microsec one-way latency (4 bytes)
        - 3026 MB/sec unidirectional bandwidth
        - 5858 MB/sec bidirectional bandwidth

  QLogic InfiniPath Support on Nehalem quad-core (2.4 GHz) with
      PCIe-Gen2 and QLogic-DDR (Two-sided Operations):
        - 2.23 microsec one-way latency (4 bytes)
        - 1883 MB/sec unidirectional bandwidth
        - 3286 MB/sec bidirectional bandwidth

  OpenFabrics/Gen2-RDMAoE (RDMA over Ethernet) Support on
      Nehalem quad-core (2.4 GHz) with ConnectX-EN
      (Two-sided operations):
        - 3.29 microsec one-way latency (4 bytes)
        - 1143 MB/sec unidirectional bandwidth
        - 2283 MB/sec bidirectional bandwidth

  Intra-node performance on Nehalem quad-core (2.4 GHz)
      (Two-sided operations, intra-socket):
        - 0.35 microsec one-way latency (4 bytes)
        - 9154 MB/sec unidirectional bandwidth, with and without LiMIC2
        - 11787 MB/sec bidirectional bandwidth with LiMIC2
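
As a rough aid to reading the figures above, the short Python sketch
below computes the ratio of bidirectional to unidirectional bandwidth
for each configuration. The numbers are copied from the list above;
the dictionary labels and the helper name bidir_ratio are illustrative
choices for this sketch and are not part of MVAPICH2.

```python
# Bandwidth figures in MB/sec, copied from the list above as
# (unidirectional, bidirectional) pairs.
# The short labels are illustrative names chosen for this sketch.
bandwidth = {
    "OpenFabrics/Gen2 ConnectX-QDR": (3026, 5858),
    "QLogic InfiniPath DDR": (1883, 3286),
    "OpenFabrics/Gen2-RDMAoE ConnectX-EN": (1143, 2283),
    "Intra-node, intra-socket": (9154, 11787),
}


def bidir_ratio(uni, bidir):
    """Return bidirectional bandwidth as a multiple of unidirectional."""
    return bidir / uni


# Map each configuration label to its bidirectional/unidirectional ratio.
ratios = {name: bidir_ratio(u, b) for name, (u, b) in bandwidth.items()}

for name, r in ratios.items():
    print(f"{name}: {r:.2f}x")
```

A ratio close to 2 means that traffic in both directions proceeds at
nearly the full one-way rate; a lower ratio means the two directions
share some limiting resource.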

Performance numbers for several other platforms, system configurations
and operations (such as collectives) can be viewed in the
`Performance' section of the project's web page.

To download MVAPICH2 1.4.1 and the associated user guide, or to access
the SVN repository, please visit the following URL:

http://mvapich.cse.ohio-state.edu

All questions, feedback, bug reports, hints for performance tuning,
patches and enhancements are welcome. Please post them to the
mvapich-discuss mailing list (mvapich-discuss at cse.ohio-state.edu).

Thanks,

The MVAPICH Team



