[mvapich-discuss] Announcing the release of OSU INAM 0.9.4
Panda, Dhabaleswar
panda at cse.ohio-state.edu
Sat Nov 10 11:53:20 EST 2018
The MVAPICH team is pleased to announce the release of OSU InfiniBand Network
Analysis and Monitoring (INAM) Tool 0.9.4.
OSU INAM monitors InfiniBand clusters in real time by querying various subnet
management entities in the network. It is also capable of interacting with the
MVAPICH2-X software stack to gain insights into the communication pattern of the
application and classify the data transferred into Point-to-Point, Collective
and Remote Memory Access (RMA). OSU INAM can also remotely monitor several
parameters of MPI processes in conjunction with MVAPICH2-X.
OSU INAM 0.9.4 (11/10/2018)
* Major Features & Enhancements (since 0.9.3):
- Enhanced performance for fabric discovery using optimized OpenMP-based
multi-threaded designs
- Ability to gather InfiniBand performance counters at sub-second
granularity for very large (>2,000 nodes) clusters
- Redesigned the database layout to reduce database size
- Enhanced fault tolerance for database operations
- Thanks to Trey Dockendorf @ OSC for the feedback
- OpenMP-based multi-threaded designs to handle database purge,
read, and insert operations simultaneously
- Improved database purging time by using bulk deletes
- Tuned database timeouts to handle very long database operations
- Improved debugging support by introducing several debugging levels
* Bug Fixes (since 0.9.3):
- Fix issue with web-based front end crashing and not being restarted
automatically
- Fix issue with locating node when searching network graph by node name
on the web-based front end
- Handle unexpected characters as input in the search boxes on the
web-based front end
- Handle negative and incorrect values for runtime parameters in the
config file gracefully
- Fix issue with marking MPI jobs as complete
- Fix issue where osuinamd was not terminating properly after a crash
- Gracefully handle the error on Network view due to timeout
- Gracefully handle the error when the database is full for osuinamd
- Automatically reconnect to MySQL daemon if the connection is lost
- Handle restarting MySQL service automatically
- Thanks to Trey Dockendorf @ OSC for the feedback
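As an illustration of the "bulk deletes" item in the feature list above, the following is a minimal sketch of the general technique, not the OSU INAM implementation. It uses Python's sqlite3 module with an in-memory database instead of MySQL so that it is self-contained. The table name (counters), its columns, and all function names are hypothetical.

```python
import sqlite3


def make_db(n):
    """Create an in-memory table with n sample rows (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE counters (id INTEGER PRIMARY KEY, ts INTEGER, value INTEGER)"
    )
    conn.executemany(
        "INSERT INTO counters (ts, value) VALUES (?, ?)",
        [(t, t * 10) for t in range(n)],
    )
    conn.commit()
    return conn


def purge_row_by_row(conn, cutoff):
    """Baseline: issue one DELETE statement per expired row."""
    ids = [r[0] for r in conn.execute(
        "SELECT id FROM counters WHERE ts < ?", (cutoff,))]
    for row_id in ids:
        conn.execute("DELETE FROM counters WHERE id = ?", (row_id,))
    conn.commit()
    return len(ids)


def purge_bulk(conn, cutoff):
    """Bulk delete: remove every expired row with a single statement."""
    cur = conn.execute("DELETE FROM counters WHERE ts < ?", (cutoff,))
    conn.commit()
    return cur.rowcount  # number of rows removed by the DELETE


def remaining(conn):
    """Count the rows left in the table."""
    return conn.execute("SELECT COUNT(*) FROM counters").fetchone()[0]
```

Both functions leave the table in the same state. The bulk version issues one statement rather than one per row, which is usually what shortens purge time on large tables.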
To download OSU INAM v0.9.4 and the associated user guide, please visit the
following URL:
http://mvapich.cse.ohio-state.edu
All questions, feedback, bug reports, hints for performance tuning, and
enhancements are welcome. Please post them to the mvapich-discuss
mailing list (mvapich-discuss at cse.ohio-state.edu).
Thanks,
The MVAPICH Team
PS: We are also happy to report that the number of organizations using
MVAPICH2 libraries (and registered at the MVAPICH site) has crossed
2,950 worldwide (in 86 countries). The number of downloads from the
MVAPICH site has crossed 505,000 (0.5 million). The MVAPICH team
would like to thank all its users and organizations!!