[mvapich-discuss] segmentation fault (signal 11), Exit code 139

Jonathan Perkins perkinjo at cse.ohio-state.edu
Fri Aug 17 16:05:34 EDT 2012


I'm glad that you nailed down the problem.  Let us know if you face any
further issues.

On Fri, Aug 17, 2012 at 03:52:42PM -0400, Hoot Thompson wrote:
> So I think I figured it out. It was an /etc/hosts issue.
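
For anyone hitting the same symptom later: a host name that resolves differently (or not at all) across nodes is a common cause of this kind of between-node failure. A minimal sketch of a consistent hosts file, with hypothetical addresses (only the host names come from this thread):

```
# /etc/hosts -- should be identical on every node; the IP addresses below are made up
10.0.0.1    mas-nn-ib
10.0.0.2    mas-dn1-ib
```

Each name should resolve to the same, reachable IB-interface address on every node; `getent hosts mas-dn1-ib` run on each node is a quick way to compare.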
> 
> Thanks as always,
> 
> hoot
> 
> -----Original Message-----
> From: Jonathan Perkins [mailto:perkinjo at cse.ohio-state.edu] 
> Sent: Friday, August 17, 2012 12:12 PM
> To: Hoot Thompson
> Cc: mvapich-discuss at cse.ohio-state.edu
> Subject: Re: [mvapich-discuss] segmentation fault (signal 11), Exit code 139
> 
> If you're using mpirun_rsh
>     mpirun_rsh -n 2 MV2_DEBUG_SHOW_BACKTRACE=1 mpiprogram
> 
> If you're using mpiexec
>     mpiexec -n 2 -env MV2_DEBUG_SHOW_BACKTRACE 1 mpiprogram
> 
> On Fri, Aug 17, 2012 at 11:59:41AM -0400, Hoot Thompson wrote:
> > Ok, so I recompiled with the debug flags but didn't get any additional error
> > info. Where do I look for the info? How do I invoke
> > MV2_DEBUG_SHOW_BACKTRACE?
> > 
> > 
> > -----Original Message-----
> > From: Jonathan Perkins [mailto:perkinjo at cse.ohio-state.edu] 
> > Sent: Friday, August 17, 2012 11:14 AM
> > To: Hoot Thompson
> > Cc: mvapich-discuss at cse.ohio-state.edu
> > Subject: Re: [mvapich-discuss] segmentation fault (signal 11), Exit code 139
> > 
> > On Fri, Aug 17, 2012 at 10:33:55AM -0400, Hoot Thompson wrote:
> > > I have a new cluster that I’ve configured in a manner similar to other
> > > systems. I get the following error when running between nodes; it works
> > > fine when running on the same node (either of the two).
> > 
> > Can you tell us a little bit about the architecture of the systems as well
> > as the software environment (such as the OS and any schedulers in use)?
> > 
> > I also suggest taking a look at
> > https://mvapich.cse.ohio-state.edu/support/user_guide_mvapich2-1.8.html#x1-1120009.1.10
> > 
> > Try using the MV2_DEBUG_SHOW_BACKTRACE parameter to see if you get more
> > output.  Also when doing your debug build, use --disable-fast in
> > addition to --enable-g=dbg.
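> > 
> > A debug rebuild along those lines might look like this (the install prefix
> > is taken from the commands later in this thread; the rest is a sketch):

```shell
# Reconfigure MVAPICH2 with debug symbols and without the fast-path optimizations
./configure --prefix=/usr/local/other/mvapich2 --enable-g=dbg --disable-fast
make clean && make && make install
```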
> > 
> > > 
> > > Hoot
> > > 
> > > 
> > > [root at mas-nn-ib ~]#  /usr/local/other/mvapich2/bin/mpirun -n 2 -hosts
> > > mas-nn-ib,mas-dn1-ib /usr/local/other/benchmarks/osu_benchmarks/osu_bw
> > > [mas-dn01-ib:mpi_rank_1][error_sighandler] Caught error: Segmentation fault (signal 11)
> > > 
> > >
> > > =====================================================================================
> > > =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> > > =   EXIT CODE: 139
> > > =   CLEANING UP REMAINING PROCESSES
> > > =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> > >
> > > =====================================================================================
> > > [proxy:0:0 at mas-nn-ib] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:955): assert (!closed) failed
> > > [proxy:0:0 at mas-nn-ib] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
> > > [proxy:0:0 at mas-nn-ib] main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event
> > > [mpiexec at mas-nn-ib] HYDT_bscu_wait_for_completion (./tools/bootstrap/utils/bscu_wait.c:70): one of the processes terminated badly; aborting
> > > [mpiexec at mas-nn-ib] HYDT_bsci_wait_for_completion (./tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
> > > [mpiexec at mas-nn-ib] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:191): launcher returned error waiting for completion
> > > [mpiexec at mas-nn-ib] main (./ui/mpich/mpiexec.c:405): process manager error waiting for completion
> > > 
> > > 
> > > 
> > > [root at mas-nn-ib ~]#  /usr/local/other/mvapich2/bin/mpirun -n 2 -hosts
> > > mas-nn-ib,mas-nn-ib /usr/local/other/benchmarks/osu_benchmarks/osu_bw
> > > # OSU MPI Bandwidth Test v3.6
> > > # Size      Bandwidth (MB/s)
> > > 1                       2.57
> > > 2                       5.20
> > > 4                      10.40
> > > 8                      20.71
> > > 16                     41.34
> > > 32                     82.84
> > > 64                    164.62
> > > 128                   315.43
> > > 256                   586.19
> > > 512                  1010.73
> > > 1024                 1576.86
> > > 2048                 2350.19
> > > 4096                 3180.26
> > > 8192                 3839.28
> > > 16384                4255.49
> > > 32768                3043.95
> > > 65536                3717.39
> > > 131072               3869.20
> > > 262144               3585.87
> > > 524288               3563.36
> > > 1048576              7079.52
> > > 2097152              9921.37
> > > 4194304              9929.68
> > > 
> > > 
> > > [root at mas-nn-ib ~]#  /usr/local/other/mvapich2/bin/mpirun -n 2 -hosts
> > > mas-dn1-ib,mas-dn1-ib /usr/local/other/benchmarks/osu_benchmarks/osu_bw
> > > # OSU MPI Bandwidth Test v3.6
> > > # Size      Bandwidth (MB/s)
> > > 1                       2.59
> > > 2                       5.22
> > > 4                      10.44
> > > 8                      20.85
> > > 16                     41.98
> > > 32                     82.87
> > > 64                    164.76
> > > 128                   320.40
> > > 256                   591.03
> > > 512                   996.87
> > > 1024                 1555.53
> > > 2048                 2336.40
> > > 4096                 3181.73
> > > 8192                 3840.08
> > > 16384                4256.52
> > > 32768                3220.42
> > > 65536                3635.69
> > > 131072               3855.15
> > > 262144               3562.69
> > > 524288               3564.40
> > > 1048576              9877.30
> > > 2097152              9901.89
> > > 4194304              9920.58
> > > 
> > > 
> > 
> > > _______________________________________________
> > > mvapich-discuss mailing list
> > > mvapich-discuss at cse.ohio-state.edu
> > > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> > 
> > 
> 

-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo
