[mvapich-discuss] multicast difficulties

Devendar Bureddy bureddy at cse.ohio-state.edu
Thu Jan 31 14:04:18 EST 2013


Hi Martin

On Thu, Jan 31, 2013 at 1:06 PM, Martin Pokorny <mpokorny at nrao.edu> wrote:
> Hello everyone.
>
> I'm having some difficulties getting multicast to work on my system. I don't
> have a lot of experience with Infiniband, so I've probably got something
> misconfigured, but I only see the problem whenever I try to run any program
> linked to the mvapich2 libraries, so I thought I'd ask here first. I'm using
> mvapich2-1.9a2 on a cluster running RHEL 6.3, with Mellanox MT26428 HCAs.
> mvapich2 was built with the following configure options:
>
>> ./configure --prefix=/opt/cbe-local/stow/mvapich2-1.9a2 --enable-romio
>> --with-file-system=lustre --enable-shared --enable-sharedlibs=gcc
>> --with-rdma-cm --enable-fast=O3 --with-limic2 --enable-g=dbg,log
>
>
> When I run a program with MV2_USE_MCAST=1 and MV2_USE_RDMA_CM=1, it fails
> with output like the following for all nodes:
>
>> Failed to modify QP to INIT
>> Error in creating UD QP
>> [cbe-node-11:mpi_rank_2][mv2_mcast_prepare_ud_ctx] MCAST UD QP creation
>> failed[cbe-node-11:mpi_rank_2][MPIDI_CH3_Init] Error in create multicast UD
>> context for multicast

We do not have MCAST support with RDMA_CM connection management, so the
job is expected to fail when these two parameters are set together.
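For example, a launch that exercises multicast without RDMA_CM would look
something like the following (the process count and host file here are
just placeholders for your setup):

    mpirun_rsh -np 16 -hostfile ./hosts MV2_USE_MCAST=1 ./bdfsim1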


>
>
> When I run the same program with only MV2_USE_MCAST=1, a segfault occurs,
> with the following backtrace (obtained using MV2_DEBUG_SHOW_BACKTRACE):
>
>> [cbe-node-09:mpi_rank_0][print_backtrace]   0:
>> /opt/cbe-local/stow/mvapich2-1.9a2/lib/libmpich.so.8(print_backtrace+0x1e)
>> [0x7f99d1089f5e]
>> [cbe-node-09:mpi_rank_0][print_backtrace]   1:
>> /opt/cbe-local/stow/mvapich2-1.9a2/lib/libmpich.so.8(error_sighandler+0x59)
>> [0x7f99d108a069]
>> [cbe-node-09:mpi_rank_0][print_backtrace]   2: /lib64/libpthread.so.0()
>> [0x355840f500]
>> [cbe-node-09:mpi_rank_0][print_backtrace]   3:
>> /opt/cbe-local/stow/mvapich2-1.9a2/lib/libmpich.so.8(MPIDI_CH3I_MRAILI_Eager_send+0x2de)
>> [0x7f99d1050e1e]
>> [cbe-node-09:mpi_rank_0][print_backtrace]   4:
>> /opt/cbe-local/stow/mvapich2-1.9a2/lib/libmpich.so.8(MPIDI_CH3_iStartMsg+0x27c)
>> [0x7f99d1039a1c]
>> [cbe-node-09:mpi_rank_0][print_backtrace]   5:
>> /opt/cbe-local/stow/mvapich2-1.9a2/lib/libmpich.so.8(+0x13c55d)
>> [0x7f99d108655d]
>> [cbe-node-09:mpi_rank_0][print_backtrace]   6:
>> /opt/cbe-local/stow/mvapich2-1.9a2/lib/libmpich.so.8(mv2_process_mcast_msg+0xfc)
>> [0x7f99d108680c]
>> [cbe-node-09:mpi_rank_0][print_backtrace]   7:
>> /opt/cbe-local/stow/mvapich2-1.9a2/lib/libmpich.so.8(MPIDI_CH3I_MRAILI_Cq_poll+0x11fc)
>> [0x7f99d10661cc]
>> [cbe-node-09:mpi_rank_0][print_backtrace]   8:
>> /opt/cbe-local/stow/mvapich2-1.9a2/lib/libmpich.so.8(MPIDI_CH3I_read_progress+0x18f)
>> [0x7f99d103d07f]
>> [cbe-node-09:mpi_rank_0][print_backtrace]   9:
>> /opt/cbe-local/stow/mvapich2-1.9a2/lib/libmpich.so.8(MPIDI_CH3I_Progress+0x13a)
>> [0x7f99d103c55a]
>> [cbe-node-09:mpi_rank_0][print_backtrace]  10:
>> /opt/cbe-local/stow/mvapich2-1.9a2/lib/libmpich.so.8(mv2_mcast_progress_comm_ready+0x69)
>> [0x7f99d1086db9]
>> [cbe-node-09:mpi_rank_0][print_backtrace]  11:
>> /opt/cbe-local/stow/mvapich2-1.9a2/lib/libmpich.so.8(create_2level_comm+0x1109)
>> [0x7f99d1154259]
>> [cbe-node-09:mpi_rank_0][print_backtrace]  12:
>> /opt/cbe-local/stow/mvapich2-1.9a2/lib/libmpich.so.8(MPIR_Init_thread+0x4d8)
>> [0x7f99d1177f18]
>> [cbe-node-09:mpi_rank_0][print_backtrace]  13:
>> /opt/cbe-local/stow/mvapich2-1.9a2/lib/libmpich.so.8(MPI_Init+0xdb)
>> [0x7f99d117727b]
>> [cbe-node-09:mpi_rank_0][print_backtrace]  14: ./bdfsim1() [0x403cee]
>> [cbe-node-09:mpi_rank_0][print_backtrace]  15:
>> /lib64/libc.so.6(__libc_start_main+0xfd) [0x355801ecdd]
>> [cbe-node-09:mpi_rank_0][print_backtrace]  16: ./bdfsim1() [0x401949]
>

It seems there is some discrepancy in the backtrace function names.
This could be because of the -O3 optimization flag set through
--enable-fast. Can you try with a build configured with
"--enable-fast=none" and report the backtrace?
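For example, reusing your original configure line with only the fast
option changed:

    ./configure --prefix=/opt/cbe-local/stow/mvapich2-1.9a2 --enable-romio \
        --with-file-system=lustre --enable-shared --enable-sharedlibs=gcc \
        --with-rdma-cm --enable-fast=none --with-limic2 --enable-g=dbg,log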

There are a few prerequisites for using the IB mcast feature with mvapich2:
    - OpenSM should be running on the cluster
    - the user should have "rw" permissions on the UMAD device (/dev/infiniband/umad0)
    - the multicast feature is enabled only when running on more than
MV2_MCAST_NUM_NODES_THRESHOLD (default: 8) nodes

Can you check if any of these is causing the issue?
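For example, a rough way to check the first two points from any compute
node (the device path shown is the usual default; yours may differ):

    # query the subnet manager to confirm an SM is up on the fabric
    sminfo
    # confirm the UMAD device is readable and writable by your user
    ls -l /dev/infiniband/umad0

If the node-count threshold is the problem, it can be lowered for a test
run by adding something like MV2_MCAST_NUM_NODES_THRESHOLD=2 to the
launch command.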

>
> The program runs correctly with only MV2_USE_RDMA_CM=1.

In this case, the program is not using the MCAST feature at all, so it
is expected to run fine.

-- 
Devendar


>
> --
> Martin Pokorny
> Software Engineer - Karl G. Jansky Very Large Array
> National Radio Astronomy Observatory - New Mexico Operations
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss

