[mvapich-discuss] MPI_Finalize causes segmentation fault

Akshay Venkatesh akshay at cse.ohio-state.edu
Wed Aug 24 11:53:56 EDT 2016


Hi,

I tried to reproduce the error you reported, but some items are missing from
the attachments. When I tried to compile, mpicc complained about missing
header files (output appended below). It would be helpful if you could
provide:
 1. The Makefile along with the required header files.
 2. The mpirun* command needed to reproduce the error.
 3. The configure flags you used to build the library.

[akshay at ivy1 bugs]$ mpicc RemoteAssistant.cpp Stream.cpp -o RemoteAssistant.out
RemoteAssistant.cpp:8:29: fatal error: RemoteAssistant.h: No such file or directory
 #include "RemoteAssistant.h"
                             ^
compilation terminated.
Stream.cpp:8:20: fatal error: Stream.h: No such file or directory
 #include "Stream.h"
                    ^
compilation terminated.

On Tue, Aug 16, 2016 at 11:13 AM, Hari Subramoni <subramoni.1 at osu.edu>
wrote:

> Hello,
>
> Sorry to hear that you're facing an issue with MVAPICH2. If possible, can
> you share your test program with us? In the meantime, can you try running
> your program after setting MV2_USE_INDEXED_TUNING=0?
>
> Thx,
> Hari.
>
> On Tue, Aug 16, 2016 at 11:01 AM, 吴雪 <sy1406125 at buaa.edu.cn> wrote:
>
>> Hi,
>> I've run into a problem: my program terminated with signal SIGSEGV
>> (segmentation fault). The error output is:
>> [gpu-cluster-2:mpi_rank_0][error_sighandler] Caught error: Segmentation
>> fault (signal 11)
>>
>> ===================================================================================
>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> =   PID 23147 RUNNING AT 192.168.2.2
>> =   EXIT CODE: 139
>> =   CLEANING UP REMAINING PROCESSES
>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>> ===================================================================================
>> [proxy:1:0 at gpu-cluster-1] HYD_pmcd_pmip_control_cmd_cb (pm/pmiserv/pmip_cb.c:912): assert (!closed) failed
>> [proxy:1:0 at gpu-cluster-1] HYDT_dmxu_poll_wait_for_event (tools/demux/demux_poll.c:76): callback returned error status
>> [proxy:1:0 at gpu-cluster-1] main (pm/pmiserv/pmip.c:206): demux engine error waiting for event
>> [mpiexec at gpu-cluster-2] HYDT_bscu_wait_for_completion (tools/bootstrap/utils/bscu_wait.c:76): one of the processes terminated badly; aborting
>> [mpiexec at gpu-cluster-2] HYDT_bsci_wait_for_completion (tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
>> [mpiexec at gpu-cluster-2] HYD_pmci_wait_for_completion (pm/pmiserv/pmiserv_pmci.c:218): launcher returned error waiting for completion
>> [mpiexec at gpu-cluster-2] main (ui/mpich/mpiexec.c:344): process manager error waiting for completion
>>
>>
>> and the gdb backtrace information is:
>> #0  0x00007fb82b7b7613 in _int_free () from /home/run/wx-workplace/mvapich2-2.1/lib/libmpi.so.12
>> #1  0x00007fb82b7b7b1b in free () from /home/run/wx-workplace/mvapich2-2.1/lib/libmpi.so.12
>> #2  0x00007fb82b69075d in MV2_cleanup_gather_tuning_table () from /home/run/wx-workplace/mvapich2-2.1/lib/libmpi.so.12
>> #3  0x00007fb82b5568b7 in MV2_collectives_arch_finalize () from /home/run/wx-workplace/mvapich2-2.1/lib/libmpi.so.12
>> #4  0x00007fb82b768df7 in MPIDI_CH3_Finalize () from /home/run/wx-workplace/mvapich2-2.1/lib/libmpi.so.12
>> #5  0x00007fb82b75e49b in MPID_Finalize () from /home/run/wx-workplace/mvapich2-2.1/lib/libmpi.so.12
>> #6  0x00007fb82b6e8037 in PMPI_Finalize () from /home/run/wx-workplace/mvapich2-2.1/lib/libmpi.so.12
>> #7  0x00007fb82bd4b6e4 in cudaRemoteFinalize () from ./libcudart_remote.so
>> #8  0x00007fb82bd505db in GC_InitStruct::~GC_InitStruct() () from ./libcudart_remote.so
>> #9  0x00007fb82adef4da in __cxa_finalize () from /lib/x86_64-linux-gnu/libc.so.6
>> #10 0x00007fb82bd4a723 in __do_global_dtors_aux () from ./libcudart_remote.so
>> #11 0x00007fffc59facb0 in ?? ()
>>
>> In my program, I use MPI_Comm_spawn to start several child processes. The
>> parent and the children communicate using MPI_Isend, MPI_Irecv,
>> MPI_Recv_init, MPI_Start, and MPI_Wait. I have not been able to figure out
>> what causes the segmentation fault. What exactly does MPI_Finalize do?
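
For reference, below is a minimal, self-contained sketch of the pattern
described above: a parent spawning children with MPI_Comm_spawn, both sides
exchanging data with MPI_Isend, MPI_Irecv, MPI_Recv_init, MPI_Start, and
MPI_Wait, followed by MPI_Finalize. This is not the original program; the
file name, process count, and echo logic are assumptions made only to
illustrate the call sequence. MPI_Finalize tears down the library's internal
state, which is where the MV2_cleanup_gather_tuning_table call in the
backtrace above runs; if this sketch also crashes there on your installation,
it would make a convenient stand-alone reproducer.

/* spawn_persistent.cpp -- illustrative sketch only; not the poster's code.
 * Build: mpicxx spawn_persistent.cpp -o spawn_persistent
 * Run:   mpiexec -n 1 ./spawn_persistent
 */
#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    MPI_Comm parent;
    MPI_Comm_get_parent(&parent);       /* MPI_COMM_NULL in the initial launch */

    if (parent == MPI_COMM_NULL) {
        /* Parent: spawn two children running this same binary (assumes the
         * binary is reachable by the same path on all nodes). */
        MPI_Comm children;
        MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 2, MPI_INFO_NULL, 0,
                       MPI_COMM_SELF, &children, MPI_ERRCODES_IGNORE);

        int out = 42, in[2];
        MPI_Request rreq[2], sreq[2];

        /* Persistent receives from each child, activated with MPI_Startall. */
        for (int i = 0; i < 2; ++i)
            MPI_Recv_init(&in[i], 1, MPI_INT, i, 0, children, &rreq[i]);
        MPI_Startall(2, rreq);

        /* Nonblocking sends to each child. */
        for (int i = 0; i < 2; ++i)
            MPI_Isend(&out, 1, MPI_INT, i, 0, children, &sreq[i]);

        MPI_Waitall(2, sreq, MPI_STATUSES_IGNORE);
        MPI_Waitall(2, rreq, MPI_STATUSES_IGNORE);
        for (int i = 0; i < 2; ++i)
            MPI_Request_free(&rreq[i]);  /* release the persistent requests */

        printf("parent received %d and %d\n", in[0], in[1]);
        MPI_Comm_disconnect(&children);  /* collective over the intercommunicator */
    } else {
        /* Child: receive one value from the parent and echo it back. */
        int val;
        MPI_Request rreq, sreq;

        MPI_Irecv(&val, 1, MPI_INT, 0, 0, parent, &rreq);
        MPI_Wait(&rreq, MPI_STATUS_IGNORE);

        MPI_Isend(&val, 1, MPI_INT, 0, 0, parent, &sreq);
        MPI_Wait(&sreq, MPI_STATUS_IGNORE);

        MPI_Comm_disconnect(&parent);
    }

    /* The reported segfault occurs inside MPI_Finalize. */
    MPI_Finalize();
    return 0;
}

To try Hari's suggestion against this sketch, MV2_USE_INDEXED_TUNING=0 can be
exported in the environment before launching, or passed with mpiexec's -env
option.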
>>
>> Looking forward to your reply.
>>
>> Thanks
>> xue
>>
>>
>
>


-- 
- Akshay