[mvapich-discuss] MPI_Finalize cause segment fault
吴雪
sy1406125 at buaa.edu.cn
Tue Aug 16 11:01:01 EDT 2016
Hi,
I've run into a problem: my program terminated with signal SIGSEGV (Segmentation fault). The error output is:
[gpu-cluster-2:mpi_rank_0][error_sighandler] Caught error: Segmentation fault (signal 11)
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 23147 RUNNING AT 192.168.2.2
= EXIT CODE: 139
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
[proxy:1:0 at gpu-cluster-1] HYD_pmcd_pmip_control_cmd_cb (pm/pmiserv/pmip_cb.c:912): assert (!closed) failed
[proxy:1:0 at gpu-cluster-1] HYDT_dmxu_poll_wait_for_event (tools/demux/demux_poll.c:76): callback returned error status
[proxy:1:0 at gpu-cluster-1] main (pm/pmiserv/pmip.c:206): demux engine error waiting for event
[mpiexec at gpu-cluster-2] HYDT_bscu_wait_for_completion (tools/bootstrap/utils/bscu_wait.c:76): one of the processes terminated badly; aborting
[mpiexec at gpu-cluster-2] HYDT_bsci_wait_for_completion (tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
[mpiexec at gpu-cluster-2] HYD_pmci_wait_for_completion (pm/pmiserv/pmiserv_pmci.c:218): launcher returned error waiting for completion
[mpiexec at gpu-cluster-2] main (ui/mpich/mpiexec.c:344): process manager error waiting for completion
and the gdb backtrace information is:
#0 0x00007fb82b7b7613 in _int_free () from /home/run/wx-workplace/mvapich2-2.1/lib/libmpi.so.12
#1 0x00007fb82b7b7b1b in free () from /home/run/wx-workplace/mvapich2-2.1/lib/libmpi.so.12
#2 0x00007fb82b69075d in MV2_cleanup_gather_tuning_table () from /home/run/wx-workplace/mvapich2-2.1/lib/libmpi.so.12
#3 0x00007fb82b5568b7 in MV2_collectives_arch_finalize () from /home/run/wx-workplace/mvapich2-2.1/lib/libmpi.so.12
#4 0x00007fb82b768df7 in MPIDI_CH3_Finalize () from /home/run/wx-workplace/mvapich2-2.1/lib/libmpi.so.12
#5 0x00007fb82b75e49b in MPID_Finalize () from /home/run/wx-workplace/mvapich2-2.1/lib/libmpi.so.12
#6 0x00007fb82b6e8037 in PMPI_Finalize () from /home/run/wx-workplace/mvapich2-2.1/lib/libmpi.so.12
#7 0x00007fb82bd4b6e4 in cudaRemoteFinalize () from ./libcudart_remote.so
#8 0x00007fb82bd505db in GC_InitStruct::~GC_InitStruct() () from ./libcudart_remote.so
#9 0x00007fb82adef4da in __cxa_finalize () from /lib/x86_64-linux-gnu/libc.so.6
#10 0x00007fb82bd4a723 in __do_global_dtors_aux () from ./libcudart_remote.so
#11 0x00007fffc59facb0 in ?? ()
In my program, I use MPI_Comm_spawn to start several child programs. The parent and children communicate using MPI_Isend, MPI_Irecv, MPI_Recv_init, MPI_Start, and MPI_Wait. I have not been able to find out what causes the segmentation fault. Also, what exactly does MPI_Finalize do?
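The communication pattern described above can be sketched as follows. This is a minimal illustration only, not the actual program: the spawned binary, message contents, and counts are assumptions. One detail worth checking is the cleanup order, since the backtrace shows the crash inside MPI_Finalize's internal teardown; persistent requests should be freed with MPI_Request_free and spawned intercommunicators disconnected with MPI_Comm_disconnect before MPI_Finalize runs.

```c
/* Minimal sketch of the described pattern: a parent spawns a child and
 * receives from it via a persistent request. Build with mpicc and launch
 * the parent under mpiexec; all names here are illustrative. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    MPI_Comm parent;
    MPI_Comm_get_parent(&parent);

    if (parent == MPI_COMM_NULL) {
        /* Parent: spawn one child running this same binary. */
        MPI_Comm children;
        MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
                       MPI_COMM_SELF, &children, MPI_ERRCODES_IGNORE);

        int value;
        MPI_Request req;
        MPI_Recv_init(&value, 1, MPI_INT, 0, 0, children, &req);
        MPI_Start(&req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        printf("parent received %d\n", value);

        /* Persistent requests are not freed by MPI_Wait; free them
         * explicitly, and disconnect the intercommunicator, before
         * MPI_Finalize tears down the library's internal state. */
        MPI_Request_free(&req);
        MPI_Comm_disconnect(&children);
    } else {
        /* Child: send one message back to the parent, then disconnect. */
        int value = 42;
        MPI_Request sreq;
        MPI_Isend(&value, 1, MPI_INT, 0, 0, parent, &sreq);
        MPI_Wait(&sreq, MPI_STATUS_IGNORE);
        MPI_Comm_disconnect(&parent);
    }

    MPI_Finalize();
    return 0;
}
```

If any request or communicator from the spawn is still live when MPI_Finalize runs, the library may free memory the application still holds, which can surface as a crash like the one in the backtrace.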
Looking forward to your reply.
Thanks
xue