[mvapich-discuss] segmentation fault (signal 11), Exit code 139

Hoot Thompson hoot at ptpnow.com
Fri Aug 17 10:33:55 EDT 2012


I have a new cluster that I’ve configured in a manner similar to other
systems. I get the following error when running between nodes; it works fine
when running on a single node (either of the two).

Hoot


[root at mas-nn-ib ~]#  /usr/local/other/mvapich2/bin/mpirun -n 2 -hosts mas-nn-ib,mas-dn1-ib /usr/local/other/benchmarks/osu_benchmarks/osu_bw
[mas-dn01-ib:mpi_rank_1][error_sighandler] Caught error: Segmentation fault (signal 11)

=====================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   EXIT CODE: 139
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=====================================================================================
[proxy:0:0 at mas-nn-ib] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:955): assert (!closed) failed
[proxy:0:0 at mas-nn-ib] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:0 at mas-nn-ib] main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event
[mpiexec at mas-nn-ib] HYDT_bscu_wait_for_completion (./tools/bootstrap/utils/bscu_wait.c:70): one of the processes terminated badly; aborting
[mpiexec at mas-nn-ib] HYDT_bsci_wait_for_completion (./tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
[mpiexec at mas-nn-ib] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:191): launcher returned error waiting for completion
[mpiexec at mas-nn-ib] main (./ui/mpich/mpiexec.c:405): process manager error waiting for completion



[root at mas-nn-ib ~]#  /usr/local/other/mvapich2/bin/mpirun -n 2 -hosts mas-nn-ib,mas-nn-ib /usr/local/other/benchmarks/osu_benchmarks/osu_bw
# OSU MPI Bandwidth Test v3.6
# Size      Bandwidth (MB/s)
1                       2.57
2                       5.20
4                      10.40
8                      20.71
16                     41.34
32                     82.84
64                    164.62
128                   315.43
256                   586.19
512                  1010.73
1024                 1576.86
2048                 2350.19
4096                 3180.26
8192                 3839.28
16384                4255.49
32768                3043.95
65536                3717.39
131072               3869.20
262144               3585.87
524288               3563.36
1048576              7079.52
2097152              9921.37
4194304              9929.68


[root at mas-nn-ib ~]#  /usr/local/other/mvapich2/bin/mpirun -n 2 -hosts mas-dn1-ib,mas-dn1-ib /usr/local/other/benchmarks/osu_benchmarks/osu_bw
# OSU MPI Bandwidth Test v3.6
# Size      Bandwidth (MB/s)
1                       2.59
2                       5.22
4                      10.44
8                      20.85
16                     41.98
32                     82.87
64                    164.76
128                   320.40
256                   591.03
512                   996.87
1024                 1555.53
2048                 2336.40
4096                 3181.73
8192                 3840.08
16384                4256.52
32768                3220.42
65536                3635.69
131072               3855.15
262144               3562.69
524288               3564.40
1048576              9877.30
2097152              9901.89
4194304              9920.58

