[mvapich-discuss] Announcing the release of OSU INAM 0.9.5

Subramoni, Hari subramoni.1 at osu.edu
Wed Dec 18 22:53:37 EST 2019


The MVAPICH team is pleased to announce the release of OSU InfiniBand Network
Analysis and Monitoring (INAM) Tool 0.9.5.

OSU INAM monitors InfiniBand clusters in real time by querying various subnet
management entities in the network. It is also capable of interacting with the
MVAPICH2-X software stack to gain insights into the communication pattern of the
application and classify the data transferred into Point-to-Point, Collective
and Remote Memory Access (RMA). OSU INAM can also remotely monitor several
parameters of MPI processes in conjunction with MVAPICH2-X.

OSU INAM 0.9.5 (12/18/2019)

* Major Features & Enhancements (since 0.9.4):
    - Support for 64-bit InfiniBand port counters
    - Optimized port counters API to fetch minimal data
    - Support for PBS job scheduler
    - Support multiple job schedulers on the same fabric
        - Thanks to Trey Dockendorf @OSC for the feedback
    - Support for InfiniBand port counters in live jobs page, live nodes page,
      historical jobs, and historical nodes pages
    - Support to display Job-level and Node-level CPU, Virtual Memory, and
      Communication Buffer utilization information for historical jobs
        - Thanks to Heechang Na @OSC for the feedback
    - Support to search switches by name and LID in historical switches page
        - Thanks to Heechang Na @OSC for the feedback
    - Support to update charts when the user changes the time frame in
      historical jobs and nodes pages
        - Thanks to Heechang Na @OSC for the feedback
    - Optimized historical replay of the network view to yield quicker results
    - Support for adding user-defined labels to switches for better
      readability and usability
        - Thanks to Trey Dockendorf @OSC for the feedback
    - Support to view connection information at port level granularity for
      each switch
        - Thanks to Heechang Na @OSC for the feedback
    - Support to view information about all jobs running on the cluster
      in live node page
    - Added information tooltips on various charts throughout OSU INAM
    - Added querying and reading interval information to historical jobs,
      switches, and nodes pages
    - Support to configure refresh rates for network topology and links
    - Support authentication for accessing the OSU INAM webpage
    - Accelerated database purging capability
    - Stabilized rendering of live network view
    - Support for interpolation of process and port counters charts in live
      job page
    - Added logging to monitor topology refresh time
    - Added support to choose MPI ranks to visualize for Link Info page
    - Compatible with OFED v4.5.1

* Bug Fixes (since 0.9.4):
    - Fix out-of-index issues with very large database sizes
    - Fix issue with updating PhantomJS cache files when topology changes
    - Fix issue with rendering of Network view when PhantomJS is disabled
    - Fix issue with Aggregate mode for port counters
    - Fix issue with link info page for overlapping MPI ranks and height of
      charts
    - Fix issue with Phantom Read and read/write consistency on the website
        - Thanks to Heechang Na @OSC for reporting the issue
    - Fix issue with calculating link utilization based on querying interval
        - Thanks to Heechang Na @OSC for reporting the issue
    - Fix issue with pattern-based node name search
        - Thanks to Heechang Na @OSC for reporting the issue
    - Fix issue with correctly initializing data points for port counters
    - Fix issue with updating performance charts for different metrics
    - Fix issue with searching and filtering nodes and jobs on network view
    - Fix labeling issues for charts
    - Fix issue with correctly handling changes in OSU INAM configuration file
    - Fix issue with displaying port data counters and port error counters
      for end compute nodes
        - Thanks to Trey Dockendorf @OSC for reporting the issue
    - Gracefully handle error when job scheduler component fails
    - Fix issue when searching for historical jobs
    - Handled jobs without job ID in Current Jobs page
    - Fix for search switch by name issue in Network page
        - Thanks to Trey Dockendorf @OSC for reporting the issue
    - Fix for link usage check box issue in Network page
        - Thanks to Heechang Na @OSC for reporting the issue
    - Fix for various issues related to port counters, process counters
        - Thanks to Heechang Na @OSC for reporting the issue
    - Fix for issues related to job-level, node-level and process-level CPU,
      Virtual Memory and Communication Buffer utilization in live job page
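
For readers unfamiliar with how link utilization relates to the querying
interval mentioned in the bug fixes above, here is a minimal sketch in
Python. It is not the OSU INAM implementation. The function name, its
parameters, and the handling of counter resets are all assumptions made
for illustration. It relies only on the fact that the InfiniBand data
counters (PortXmitData, PortRcvData) count in units of 4 octets.

```python
# Hypothetical helper for illustration only; not part of OSU INAM.
def link_utilization(prev_count, curr_count, interval_s, link_rate_gbps):
    """Return the fraction of link bandwidth used over one querying interval.

    prev_count, curr_count: two successive samples of a port data counter
        (for example PortXmitData), in units of 4 octets.
    interval_s: time between the two samples, in seconds.
    link_rate_gbps: assumed usable data rate of the link, in Gb/s.
    """
    if interval_s <= 0 or link_rate_gbps <= 0:
        raise ValueError("interval_s and link_rate_gbps must be positive")
    delta_words = curr_count - prev_count
    if delta_words < 0:
        # The counter was reset between samples; no meaningful value exists.
        return None
    bits_transferred = delta_words * 4 * 8  # 4 octets per count, 8 bits per octet
    capacity_bits = interval_s * link_rate_gbps * 1e9
    return bits_transferred / capacity_bits


# 312,500,000 counts in 1 second on a 100 Gb/s link is 10 Gb, i.e. 10% usage.
print(link_utilization(0, 312_500_000, 1.0, 100.0))
```

Dividing by the actual sampling interval, rather than by a fixed nominal
value, keeps the result correct when the querying interval is reconfigured.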

For downloading OSU INAM v0.9.5 and associated user guide, please visit the
following URL:

http://mvapich.cse.ohio-state.edu

All questions, feedback, bug reports, hints for performance tuning, and
enhancements are welcome. Please post them to the mvapich-discuss mailing list
(mvapich-discuss at cse.ohio-state.edu).

Thanks,

The MVAPICH Team

PS: We are also happy to inform you that the number of organizations using
MVAPICH2 libraries (and registered at the MVAPICH site) has crossed
3,050 worldwide (in 89 countries). The number of downloads from the
MVAPICH site has crossed 630,000 (0.6 million). The MVAPICH team
would like to thank all its users and organizations!