[mvapich-discuss] mvapich2 debug/verbose mode?

Belgin, Mehmet mehmet.belgin at oit.gatech.edu
Mon Dec 19 17:59:52 EST 2016


Dear all,

We are trying to troubleshoot an application that crashes only when run on a large number of cores (>1024), and only after more than a day of runtime, which makes troubleshooting very difficult. Unfortunately, the error messages mvapich2 generates do not give us much insight. We are interested in details such as which node pair(s) were involved in the failed message-passing operation (similar to what Open MPI's crash messages report).

I have been looking at the runtime options and found "MV2_DEBUG_SHOW_BACKTRACE". What other parameters would you recommend to maximize the verbosity of mvapich2 so that we can catch any clues? We are using intel/15.0 with mvapich2/2.1 and the usual hydra launcher.

I would appreciate any suggestions you may have.

Thank you in advance and happy holidays,

-Mehmet

=========================================
Mehmet Belgin, Ph.D.
Scientific Computing Consultant
Partnership for an Advanced Computing Environment (PACE)
Georgia Institute of Technology
258 4th Street NW, Rich Building, #326
Atlanta, GA  30332-0700
Office: (404) 385-0665


