[mvapich-discuss] Trying to reproduce the checkpointing benchmarking result -- Segmentation fault

Harsh Hemani harshscience777 at gmail.com
Wed Jan 14 04:33:26 EST 2015


I am trying to reproduce the results on this page:
http://mvapich.cse.ohio-state.edu/performance/checkpointing/

I have set up a 2-node cluster with exactly the same specifications as those
listed on the page.
When I run the benchmark on a single machine, I do not get any output.
When I run it on 2 machines, I get a segmentation fault:

mpirun_rsh opening file /home/ckpt/.2.auto
[node1:mpi_rank_11][error_sighandler] Caught error: Segmentation fault (signal 11)

Software environment details:
MVAPICH2 version : MVAPICH2-2.1a
BLCR version     : BLCR 0.8.5 (all tests passed successfully)
Linux kernel     : 2.6.32-431.el6.x86_64 (Scientific Linux 6.5)
OFED version     : Mellanox OFED 2.2-1.0.1 for RHEL/CentOS 6.5
SELinux and iptables are disabled on both machines.

What could I be missing here?


Thanks