[mvapich-discuss] Segmentation fault in MPI_Reduce using MPI_IN_PLACE and root != 0

Hari Subramoni subramoni.1 at osu.edu
Tue May 19 22:23:53 EDT 2015


Dear Markus,

I see an inconsistency between the MVAPICH2 build you are using and the
system on which you are running it.

You've built MVAPICH2 with the default configuration, which creates a
build for OpenFabrics (OFA) IB/iWARP/RoCE based systems; the "ch3:mrail"
in your mpiname -a output below confirms this. However, the output of
ibv_devinfo indicates that you are running on a system with QLogic
InfiniPath HCAs. To use MVAPICH2 on InfiniPath adapters, the QLogic PSM
interface needs to be selected at configure time.

Please refer to the following link for more information about how to
configure MVAPICH2 for the QLogic PSM interface:

http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.1-userguide.html#x1-160004.8
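
For reference, a configure invocation along the following lines should
select the PSM device (a sketch based on your current options, with
"--disable-fast" dropped per the note below; adjust the prefix, and add
PSM header/library paths if they are in non-default locations):

    ./configure --prefix=/opt/mvapich2-2.1 --enable-shared \
                --enable-g=dbg --enable-debuginfo \
                --with-device=ch3:psm

After rebuilding, "mpiname -a" should report "ch3:psm" rather than
"ch3:mrail".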

Could you please re-build MVAPICH2 as recommended in the above link and try
your use case again?

On a different note, our developer team informs me that you do not need
both "--enable-g=dbg" and "--disable-fast"; "--enable-g=dbg" alone will
suffice.

Please let us know if you face any further issues and we will be happy to
help.

Best Regards,
Hari.

On Tue, May 19, 2015 at 10:55 AM, Markus Geimer <mg1102 at web.de> wrote:

> Dear MVAPICH developers,
>
> While testing my MPI application with MVAPICH2 2.1, I ran
> into a segmentation fault.  After an in-depth investigation,
> I was able to boil the issue down to the attached minimal
> example.
>
> Calling MPI_Reduce with MPI_IN_PLACE and root != 0 multiple
> times causes the segfault.  The example works for me with
> ITER <= 16, but fails with ITER >= 17.  If root == 0, it
> works regardless of the number of iterations.  Note that the
> example also works fine with MPICH 3.1 and Open MPI 1.8.3.
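>
> In case the attachment gets scrubbed by the list archive, the
> example boils down to roughly the following (a sketch; the exact
> datatype, count, and file name in my test may differ):
>
>     #include <mpi.h>
>
>     #define ITER 17   /* works with ITER <= 16, fails with >= 17 */
>     #define ROOT 1    /* any root != 0 triggers the crash */
>
>     int main(int argc, char **argv)
>     {
>         int rank, i, buf = 1;
>
>         MPI_Init(&argc, &argv);
>         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>         for (i = 0; i < ITER; i++) {
>             if (rank == ROOT)
>                 /* The root contributes its value in place. */
>                 MPI_Reduce(MPI_IN_PLACE, &buf, 1, MPI_INT,
>                            MPI_SUM, ROOT, MPI_COMM_WORLD);
>             else
>                 MPI_Reduce(&buf, NULL, 1, MPI_INT,
>                            MPI_SUM, ROOT, MPI_COMM_WORLD);
>         }
>         MPI_Finalize();
>         return 0;
>     }
>
> Compiled with mpicc and run with two processes, this crashes as
> shown below.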
>
> The test system is running Debian 7 with kernel 3.2.65; the
> compiler used is GCC 4.8.3 (built from vanilla source).
> Please find some more detailed information below.  If you
> need more -- or have a patch for me to try out -- please let
> me know.
>
> Thanks,
> Markus
>
> ----- 8< ----- 8< ----- 8< ----- 8< ----- 8< ----- 8< -----
>
> $ mpiexec -n 2 ./a.out
> [host:mpi_rank_1][error_sighandler] Caught error: Segmentation fault (signal 11)
> [host:mpi_rank_1][print_backtrace]   0: /opt/mvapich2-2.1/lib/libmpi.so.12(print_backtrace+0x20) [0x7f553b6b404c]
> [host:mpi_rank_1][print_backtrace]   1: /opt/mvapich2-2.1/lib/libmpi.so.12(error_sighandler+0x77) [0x7f553b6b419e]
> [host:mpi_rank_1][print_backtrace]   2: /lib/x86_64-linux-gnu/libc.so.6(+0x321e0) [0x7f553ae331e0]
> [host:mpi_rank_1][print_backtrace]   3: /lib/x86_64-linux-gnu/libc.so.6(+0x12abd0) [0x7f553af2bbd0]
> [host:mpi_rank_1][print_backtrace]   4: /opt/mvapich2-2.1/lib/libmpi.so.12(+0x3f6857) [0x7f553b582857]
> [host:mpi_rank_1][print_backtrace]   5: /opt/mvapich2-2.1/lib/libmpi.so.12(MPIR_Localcopy+0x357) [0x7f553b5833b4]
> [host:mpi_rank_1][print_backtrace]   6: /opt/mvapich2-2.1/lib/libmpi.so.12(MPIR_Reduce_shmem_MV2+0x385) [0x7f553b2a3d50]
> [host:mpi_rank_1][print_backtrace]   7: /opt/mvapich2-2.1/lib/libmpi.so.12(MPIR_Reduce_two_level_helper_MV2+0x594) [0x7f553b2a57ec]
> [host:mpi_rank_1][print_backtrace]   8: /opt/mvapich2-2.1/lib/libmpi.so.12(MPIR_Reduce_index_tuned_intra_MV2+0x97b) [0x7f553b2a6af3]
> [host:mpi_rank_1][print_backtrace]   9: /opt/mvapich2-2.1/lib/libmpi.so.12(MPIR_Reduce_MV2+0x9f) [0x7f553b2a6ea0]
> [host:mpi_rank_1][print_backtrace]  10: /opt/mvapich2-2.1/lib/libmpi.so.12(MPIR_Reduce_impl+0x8c) [0x7f553b21534b]
> [host:mpi_rank_1][print_backtrace]  11: /opt/mvapich2-2.1/lib/libmpi.so.12(PMPI_Reduce+0x1709) [0x7f553b216b9b]
> [host:mpi_rank_1][print_backtrace]  12: ./a.out() [0x400838]
> [host:mpi_rank_1][print_backtrace]  13: /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xfd) [0x7f553ae1fead]
> [host:mpi_rank_1][print_backtrace]  14: ./a.out() [0x4006c9]
>
> $ mpiname -a
> MVAPICH2 2.1 Fri Apr 03 20:00:00 EDT 2015 ch3:mrail
>
> Compilation
> CC: gcc    -g -O0
> CXX: g++   -g -O0
> F77: gfortran -L/lib -L/lib   -g -O0
> FC: gfortran   -g -O0
>
> Configuration
> --prefix=/opt/mvapich2-2.1 --enable-shared --enable-g=dbg --disable-fast
> --enable-debuginfo
>
> $ ibv_devinfo
> hca_id: qib0
>         transport:                      InfiniBand (0)
>         fw_ver:                         0.0.0
>         node_guid:                      0011:7500:00ff:d76b
>         sys_image_guid:                 0011:7500:00ff:d76b
>         vendor_id:                      0x1175
>         vendor_part_id:                 16
>         hw_ver:                         0x2
>         board_id:                       InfiniPath_QLE7140
>         phys_port_cnt:                  1
>                 port:   1
>                         state:                  PORT_ACTIVE (4)
>                         max_mtu:                4096 (5)
>                         active_mtu:             2048 (4)
>                         sm_lid:                 1
>                         port_lid:               1
>                         port_lmc:               0x00
>                         link_layer:             InfiniBand
>