[mvapich-discuss] Announcing the release of OSU INAM 0.9.4

Panda, Dhabaleswar panda at cse.ohio-state.edu
Sat Nov 10 11:53:20 EST 2018


The MVAPICH team is pleased to announce the release of OSU InfiniBand Network
Analysis and Monitoring (INAM) Tool 0.9.4.

OSU INAM monitors InfiniBand clusters in real time by querying various subnet
management entities in the network. It is also capable of interacting with the
MVAPICH2-X software stack to gain insights into the communication pattern of the
application and classify the data transferred into Point-to-Point, Collective
and Remote Memory Access (RMA). OSU INAM can also remotely monitor several
parameters of MPI processes in conjunction with MVAPICH2-X.

OSU INAM 0.9.4 (11/10/2018)

* Major Features & Enhancements (since 0.9.3):
    - Enhanced performance for fabric discovery using optimized OpenMP-based
      multi-threaded designs
    - Ability to gather InfiniBand performance counters at sub-second
      granularity for very large (>2,000 nodes) clusters
        - Redesigned the database layout to reduce database size
    - Enhanced fault tolerance for database operations
        - Thanks to Trey Dockendorf @ OSC for the feedback
    - OpenMP-based multi-threaded designs to handle database purge,
      read, and insert operations simultaneously
    - Improved database purging time by using bulk deletes
    - Tuned database timeouts to handle very long database operations
    - Improved debugging support by introducing several debugging levels
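
As a rough illustration of the "bulk deletes" item above, the sketch below
contrasts row-by-row purging with a single bulk DELETE. It is not taken from
OSU INAM: it uses Python's built-in sqlite3 module in place of MySQL, and the
table name (port_counters), its columns, and the function names are all
hypothetical.

```python
import sqlite3


def make_db(n_rows):
    """Create an in-memory database with a hypothetical counters table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE port_counters (ts INTEGER, node TEXT, xmit_data INTEGER)"
    )
    conn.executemany(
        "INSERT INTO port_counters VALUES (?, ?, ?)",
        ((t, f"node{t % 8}", t * 10) for t in range(n_rows)),
    )
    conn.commit()
    return conn


def purge_row_by_row(conn, cutoff):
    """Purge old samples with one DELETE statement per row (slow path)."""
    rows = conn.execute(
        "SELECT rowid FROM port_counters WHERE ts < ?", (cutoff,)
    ).fetchall()
    for (rowid,) in rows:
        conn.execute("DELETE FROM port_counters WHERE rowid = ?", (rowid,))
    conn.commit()
    return len(rows)


def purge_bulk(conn, cutoff):
    """Purge old samples with a single DELETE covering the whole range."""
    cur = conn.execute("DELETE FROM port_counters WHERE ts < ?", (cutoff,))
    conn.commit()
    return cur.rowcount  # number of rows removed by the statement


if __name__ == "__main__":
    db = make_db(1000)
    removed = purge_bulk(db, cutoff=900)
    remaining = db.execute("SELECT COUNT(*) FROM port_counters").fetchone()[0]
    print(f"removed {removed} rows, {remaining} remain")
```

Both functions remove the same rows; the bulk form issues one statement
instead of one per row, which is the general reason bulk deletes tend to
shorten purge time. How OSU INAM implements its purge is not described in
this announcement.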

* Bug Fixes (since 0.9.3):
    - Fix issue with web-based front end crashing and not being restarted
      automatically
    - Fix issue with locating node when searching network graph by node name
      on the web-based front end
    - Handle unexpected characters as input in the search boxes on the
      web-based front end
    - Handle negative and incorrect values for runtime parameters in the
      config file gracefully
    - Fix issue with marking MPI jobs as complete
    - Fix issue where osuinamd was not terminating properly after a crash
    - Gracefully handle the error on Network view due to timeout
    - Gracefully handle the error when the database is full for osuinamd
    - Automatically reconnect to MySQL daemon if the connection is lost
    - Handle restarting MySQL service automatically
        - Thanks to Trey Dockendorf @ OSC for the feedback

For downloading OSU INAM v0.9.4 and associated user guide, please visit the
following URL:

http://mvapich.cse.ohio-state.edu

All questions, feedback, bug reports, hints for performance tuning, and
enhancements are welcome. Please post them to the mvapich-discuss
mailing list (mvapich-discuss at cse.ohio-state.edu).

Thanks,

The MVAPICH Team

PS: We are also happy to report that the number of organizations using
MVAPICH2 libraries (and registered at the MVAPICH site) has crossed
2,950 worldwide (in 86 countries). The number of downloads from the
MVAPICH site has crossed 505,000 (0.5 million). The MVAPICH team
would like to thank all its users and organizations!


