[mvapich-discuss] Announcing the Release of MVAPICH2 2.0rc1, MVAPICH2-X 2.0rc1 and OSU Micro-Benchmarks (OMB) 4.3
Panda, Dhabaleswar
panda at cse.ohio-state.edu
Tue Mar 25 03:06:06 EDT 2014
The MVAPICH team is pleased to announce the release of MVAPICH2
2.0rc1, MVAPICH2-X 2.0rc1 (hybrid MPI+PGAS (UPC and OpenSHMEM) with a
Unified Communication Runtime), and OSU Micro-Benchmarks (OMB) 4.3.
Features, Enhancements, and Bug Fixes for MVAPICH2 2.0rc1 (since
MVAPICH2 2.0b release) are listed here.
* Features and Enhancements (since 2.0b):
- Based on MPICH-3.1
- Enhanced direct RDMA based designs for MPI_Put and MPI_Get operations
in OFA-IB-CH3 channel
- Optimized communication when using MPI_Win_allocate for OFA-IB-CH3
channel
- MPI-3 RMA support for CH3-PSM channel
- Multi-rail support for UD-Hybrid channel
- Optimized and tuned blocking and non-blocking collectives for
OFA-IB-CH3, OFA-IB-Nemesis, and CH3-PSM channels
- Improved hierarchical job startup performance
- Optimized sub-array data-type processing for GPU-to-GPU communication
- Tuning for Mellanox Connect-IB adapters
- Updated hwloc to version 1.8
- Added options to specify CUDA library paths
- Deprecation of uDAPL-CH3 channel
* Bug-Fixes (since 2.0b):
- Fix issues related to MPI-3 RMA locks
- Fix an issue related to MPI-3 dynamic window
- Fix issues related to MPI_Win_allocate backed by shared memory
- Fix issues related to large message transfers for OFA-IB-CH3 and
OFA-IB-Nemesis channels
- Fix warning in job launch, when using DPM
- Fix an issue related to MPI atomic operations on HCAs without atomics
support
- Fix an issue related to compiler selection. (The GNU, Intel, PGI,
and EkoPath compilers are preferred, in that order.)
- Thanks to Uday R Bondhugula from IISc for the report
- Fix an issue in message coalescing
- Prevent printing out inter-node runtime parameters for pure intra-node
runs
- Thanks to Jerome Vienne from TACC for the report
- Fix an issue related to ordering of messages for GPU-to-GPU transfers
- Fix a few memory leaks and warnings
The MVAPICH2-X 2.0rc1 software package provides support for hybrid
MPI+PGAS (UPC and OpenSHMEM) programming models with a unified
communication runtime for emerging exascale systems. It gives users
the flexibility to write pure MPI, MPI+OpenMP, pure UPC, and pure
OpenSHMEM programs, as well as hybrid MPI(+OpenMP) + PGAS (UPC and
OpenSHMEM) programs, all on top of a single unified communication
runtime.
Features, enhancements and bug-fixes for MVAPICH2-X 2.0rc1 (since
MVAPICH2-X 2.0b) are as follows:
* Features and Enhancements (since 2.0b):
- OpenSHMEM Features
- Based on OpenSHMEM reference implementation 1.0f
- Improved intra-node communication performance using
Shared memory and Cross Memory Attach (CMA)
- UPC Features
- Based on Berkeley UPC 2.18.0 (contains changes/additions
in preparation for upcoming UPC 1.3 specification)
- Optimized UPC collectives (improved performance for
upc_all_broadcast, upc_all_scatter, upc_all_gather,
upc_all_gather_all, and upc_all_exchange)
- MPI Features
- Based on MVAPICH2 2.0rc1 (OFA-IB-CH3 interface)
- Unified Runtime Features
- Based on MVAPICH2 2.0rc1 (OFA-IB-CH3 interface). All the
runtime features enabled by default in OFA-IB-CH3 interface
of MVAPICH2 2.0rc1 are available in MVAPICH2-X 2.0rc1
* Bug Fixes (since 2.0b):
- OpenSHMEM Bug Fixes
- Fix an issue related to atomics on HCAs without atomics support
New features and Enhancements of OSU Micro-Benchmarks (OMB) 4.3 (since
OMB 4.2 release) are listed here.
* New Features & Enhancements (since 4.2)
- The suite adds several new (or updated) benchmarks that measure the
performance of MPI-3 RMA communication operations, with options to
select the window creation method (WIN_CREATE, WIN_DYNAMIC, and
WIN_ALLOCATE) and the synchronization function (LOCK, PSCW, FENCE,
FLUSH, FLUSH_LOCAL, and LOCK_ALL) in each benchmark
* osu_acc_latency
* osu_cas_latency
* osu_fop_latency
* osu_get_acc_latency
* osu_get_bw
* osu_get_latency
* osu_put_bibw
* osu_put_bw
* osu_put_latency
- New UPC Collective Benchmarks
* osu_upc_all_barrier
* osu_upc_all_broadcast
* osu_upc_all_exchange
* osu_upc_all_gather
* osu_upc_all_gather_all
* osu_upc_all_reduce
* osu_upc_all_scatter
- Build the MPI-3 benchmarks only when MPI-3 support is detected in the
MPI library
* Bug Fixes (since 4.2)
- Add shmem_quiet() in OpenSHMEM Message Rate benchmark to ensure all
previously issued operations are completed
- Allocate pWrk from symmetric heap in OpenSHMEM Reduce benchmark
To download MVAPICH2 2.0rc1, MVAPICH2-X 2.0rc1, OMB 4.3, and the
associated user guides and quick start guide, or to access the SVN
repository, please visit the following URL:
http://mvapich.cse.ohio-state.edu
All questions, feedback, bug reports, performance-tuning hints,
patches, and enhancements are welcome. Please post them to the
mvapich-discuss mailing list (mvapich-discuss at cse.ohio-state.edu).
Thanks,
The MVAPICH Team