[mvapich-discuss] segmentation fault

Hoot Thompson hoot at ptpnow.com
Tue Jun 19 07:15:54 EDT 2012


Here you go:

[root at penguin1-vm1 mvapich2-1.8-r5435]# mpiexec -n 2 -hosts 
10.10.10.1,10.10.10.2 /root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw
[penguin1-vm1:mpi_rank_0][error_sighandler] Caught error: Segmentation 
fault (signal 11)
[pengui2-vm1:mpi_rank_1][error_sighandler] Caught error: Segmentation 
fault (signal 11)
[pengui2-vm1:mpi_rank_1][print_backtrace]   0: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x5435b6]
[pengui2-vm1:mpi_rank_1][print_backtrace]   1: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x5436f2]
[pengui2-vm1:mpi_rank_1][print_backtrace]   2: /lib64/libpthread.so.0() 
[0x3ea940f4a0]
[pengui2-vm1:mpi_rank_1][print_backtrace]   3: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x4664b5]
[pengui2-vm1:mpi_rank_1][print_backtrace]   4: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x4681e7]
[pengui2-vm1:mpi_rank_1][print_backtrace]   5: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x468a4a]
[pengui2-vm1:mpi_rank_1][print_backtrace]   6: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x43401c]
[pengui2-vm1:mpi_rank_1][print_backtrace]   7: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x544c8c]
[pengui2-vm1:mpi_rank_1][print_backtrace]   8: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x49fc9b]
[pengui2-vm1:mpi_rank_1][print_backtrace]   9: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x40daaa]
[pengui2-vm1:mpi_rank_1][print_backtrace]  10: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x40d3ab]
[pengui2-vm1:mpi_rank_1][print_backtrace]  11: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x4051ca]
[pengui2-vm1:mpi_rank_1][print_backtrace]  12: 
/lib64/libc.so.6(__libc_start_main+0xfd) [0x3ea901ecdd]
[pengui2-vm1:mpi_rank_1][print_backtrace]  13: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x4050b9]
[penguin1-vm1:mpi_rank_0][print_backtrace]   0: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x5435b6]
[penguin1-vm1:mpi_rank_0][print_backtrace]   1: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x5436f2]
[penguin1-vm1:mpi_rank_0][print_backtrace]   2: /lib64/libpthread.so.0() 
[0x313d80f4a0]
[penguin1-vm1:mpi_rank_0][print_backtrace]   3: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x4664b5]
[penguin1-vm1:mpi_rank_0][print_backtrace]   4: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x4681e7]
[penguin1-vm1:mpi_rank_0][print_backtrace]   5: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x468a4a]
[penguin1-vm1:mpi_rank_0][print_backtrace]   6: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x43401c]
[penguin1-vm1:mpi_rank_0][print_backtrace]   7: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x544c8c]
[penguin1-vm1:mpi_rank_0][print_backtrace]   8: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x49fc9b]
[penguin1-vm1:mpi_rank_0][print_backtrace]   9: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x40daaa]
[penguin1-vm1:mpi_rank_0][print_backtrace]  10: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x40d3ab]
[penguin1-vm1:mpi_rank_0][print_backtrace]  11: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x4051ca]
[penguin1-vm1:mpi_rank_0][print_backtrace]  12: 
/lib64/libc.so.6(__libc_start_main+0xfd) [0x313d41ecdd]
[penguin1-vm1:mpi_rank_0][print_backtrace]  13: 
/root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw() [0x4050b9]

=====================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   EXIT CODE: 139
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=====================================================================================
[proxy:0:1 at pengui2-vm1] HYD_pmcd_pmip_control_cmd_cb 
(./pm/pmiserv/pmip_cb.c:955): assert (!closed) failed
[proxy:0:1 at pengui2-vm1] HYDT_dmxu_poll_wait_for_event 
(./tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:1 at pengui2-vm1] main (./pm/pmiserv/pmip.c:226): demux engine 
error waiting for event
APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
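As a side note, the frames above are raw return addresses inside the osu_bw binary. If the binary has not been stripped, addr2line (from binutils) can map them to function names, and to file:line locations when debug info is present. A minimal sketch, using the binary path and a few addresses from the trace above:

```shell
# Resolve backtrace addresses to symbols; -f prints the function name,
# -C demangles C++ names. With a -g build this also prints file:line.
for addr in 0x5435b6 0x4664b5 0x40daaa; do
    addr2line -f -C \
        -e /root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw "$addr"
done
```

If the output is only `??:0`, the benchmark needs to be rebuilt with `-g` (and against a debug MVAPICH2) before the addresses become meaningful.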

On 06/18/2012 09:41 PM, Jonathan Perkins wrote:
> On Mon, Jun 18, 2012 at 03:50:06PM -0400, Hoot Thompson wrote:
>> A little background, I've been working with Mellanox to get SR-IOV
>> working between two Virtual Machines (VM). As of today, I have two
>> real machines each with a VM and a virtual IB connection up between
>> them. Logged into one of the VMs, I can ping and run the rdma_bw and
>> rdma_lat tests between the VMs just fine. Attempts to run osu_bw
>> (compiled with the Intel compiler) fail with the
>> following:
>>
>>
>> [root at penguin1-vm1 mvapich2-1.8-r5435]# mpiexec -n 2 -hosts
>> 10.10.10.1,10.10.10.2
>> /root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw
>> [penguin1-vm1:mpi_rank_0][error_sighandler] Caught error:
>> Segmentation fault (signal 11)
>> [pengui2-vm1:mpi_rank_1][error_sighandler] Caught error:
>> Segmentation fault (signal 11)
>>
>> =====================================================================================
>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>> =   EXIT CODE: 139
>> =   CLEANING UP REMAINING PROCESSES
>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>> =====================================================================================
>> [proxy:0:1 at pengui2-vm1] HYD_pmcd_pmip_control_cmd_cb
>> (./pm/pmiserv/pmip_cb.c:955): assert (!closed) failed
>> [proxy:0:1 at pengui2-vm1] HYDT_dmxu_poll_wait_for_event
>> (./tools/demux/demux_poll.c:77): callback returned error status
>> [proxy:0:1 at pengui2-vm1] main (./pm/pmiserv/pmip.c:226): demux engine
>> error waiting for event
>> APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
>>
>>
>> Any thoughts?
> Using a debug build of MVAPICH2 (built with --disable-fast
> --enable-g=dbg), please set the environment variable
> MV2_DEBUG_CORESIZE=unlimited or MV2_DEBUG_SHOW_BACKTRACE=1 so that we
> can get more information about why it's segfaulting.
>
> http://mvapich.cse.ohio-state.edu/support/user_guide_mvapich2-1.8.html#x1-1120009.1.10
>
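For reference, the rebuild-and-rerun sequence Jonathan describes might look roughly like this (the source and install paths are assumptions; the configure flags and MV2_* variables are the ones from his reply, and `-genv` is the Hydra mpiexec option for propagating environment variables to all ranks):

```shell
# Rebuild MVAPICH2 as a debug build (flags per Jonathan's suggestion).
cd /root/osu/mvapich2-1.8-r5435        # assumed source directory
./configure --disable-fast --enable-g=dbg
make && make install

# Rerun the benchmark with backtraces and core dumps enabled on both
# ranks; -genv exports the variables to the remote process as well.
mpiexec -n 2 -hosts 10.10.10.1,10.10.10.2 \
    -genv MV2_DEBUG_SHOW_BACKTRACE 1 \
    -genv MV2_DEBUG_CORESIZE unlimited \
    /root/osu/mvapich2-1.8-r5435/osu_benchmarks/osu_bw
```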
