[mvapich-discuss] Problem with MVAPICH2 (1.9, r6338) on InfiniBand

Steven Vancoillie steven.vancoillie at gmail.com
Mon Jul 8 12:21:05 EDT 2013


Hi,

When running an application (Global Arrays test suite) on top of
MVAPICH2 (via ARMCI-MPI), I get the following error:

[0->3] send desc error, wc_opcode=0
[0->3] wc.status=12, wc.wr_id=0x6cf5c8, wc.opcode=0, vbuf->phead->type=54
= MPIDI_CH3_PKT_CLOSE
[r5i1n6:mpi_rank_0][MPIDI_CH3I_MRAILI_Cq_poll]
src/mpid/ch3/channels/mrail/src/gen2/ibv_channel_manager.c:586: [] Got
completion with error 12, vendor code=0x81, dest rank=3
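
For reference, the wc.status number in the log is a libibverbs work
completion status code. Below is a minimal Python sketch for decoding it,
assuming the numbering of `enum ibv_wc_status` in <infiniband/verbs.h>
(copied by hand here, so it should be checked against the installed
headers). The helper names `wc_status_name` and `parse_wc_status` are
hypothetical and not part of MVAPICH2 or libibverbs.

```python
import re

# Assumption: values follow `enum ibv_wc_status` from libibverbs'
# <infiniband/verbs.h>; only the first entries are listed here.
IBV_WC_STATUS = {
    0: "IBV_WC_SUCCESS",
    1: "IBV_WC_LOC_LEN_ERR",
    2: "IBV_WC_LOC_QP_OP_ERR",
    3: "IBV_WC_LOC_EEC_OP_ERR",
    4: "IBV_WC_LOC_PROT_ERR",
    5: "IBV_WC_WR_FLUSH_ERR",
    6: "IBV_WC_MW_BIND_ERR",
    7: "IBV_WC_BAD_RESP_ERR",
    8: "IBV_WC_LOC_ACCESS_ERR",
    9: "IBV_WC_REM_INV_REQ_ERR",
    10: "IBV_WC_REM_ACCESS_ERR",
    11: "IBV_WC_REM_OP_ERR",
    12: "IBV_WC_RETRY_EXC_ERR",
    13: "IBV_WC_RNR_RETRY_EXC_ERR",
}


def wc_status_name(status):
    """Return the symbolic name for a numeric wc.status, if known."""
    return IBV_WC_STATUS.get(status, "unknown (%d)" % status)


def parse_wc_status(line):
    """Extract the integer after 'wc.status=' from a log line, or None."""
    match = re.search(r"wc\.status=(\d+)", line)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    log_line = ("[0->3] wc.status=12, wc.wr_id=0x6cf5c8, wc.opcode=0, "
                "vbuf->phead->type=54")
    print(wc_status_name(parse_wc_status(log_line)))
```

Under that assumption, status 12 maps to IBV_WC_RETRY_EXC_ERR, i.e. the
transport retry counter was exceeded while sending to rank 3.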

I would be grateful if someone can help me resolve this issue.

This is the output of mpiexec --version:

HYDRA build details:
    Version:                                 3.0.3
    Release Date:                            unreleased development copy
    CC:                              icc
    CXX:                             icpc
    F77:                             ifort
    F90:                             ifort
    Configure options:
'--disable-option-checking' '--prefix=/apps/leuven/mvapich2/1.9_intel'
'CC=icc' 'CXX=icpc' 'F77=ifort' '--enable-f77' '--enable-fc'
'--enable-cxx' '--enable-romio' '--enable-debuginfo' '--enable-mpe'
'--enable-shared' '--without-ftb' '--with-mpe' '--disable-ckpt'
'--disable-mcast' '--disable-checkerrors' '--enable-embedded-mode'
'--cache-file=/dev/null' '--srcdir=.' 'CFLAGS= -DNDEBUG -DNVALGRIND
-O2' 'LDFLAGS=-L/lib -L/lib -Wl,-rpath,/lib -L/lib' 'LIBS=-libumad
-libverbs -lrt -lnuma -lpthread ' 'CPPFLAGS=
-I/data/leuven/source/mvapich2/1.9_intel/mvapich2-1.9-r6338/src/mpl/include
-I/data/leuven/source/mvapich2/1.9_intel/mvapich2-1.9-r6338/src/mpl/include
-I/data/leuven/source/mvapich2/1.9_intel/mvapich2-1.9-r6338/src/openpa/src
-I/data/leuven/source/mvapich2/1.9_intel/mvapich2-1.9-r6338/src/openpa/src
-I/data/leuven/source/mvapich2/1.9_intel/mvapich2-1.9-r6338/src/mpi/romio/include
-I/include -I/include'
    Process Manager:                         pmi
    Launchers available:                     ssh rsh fork slurm ll lsf
sge manual persist
    Topology libraries available:            hwloc
    Resource management kernels available:   user slurm ll lsf sge pbs cobalt
    Checkpointing libraries available:
    Demux engines available:                 poll select

Furthermore, someone suggested that I build MVAPICH2 without mrail
support: "you should probably use the default build of mvapich instead
of the mrail build, unless you want to use multiple network lanes
simultaneously". Unfortunately, I can't work out from the user guide
how to do this, or even what "multiple rail" means. Is there online
documentation that explains this and describes how I should build
MVAPICH2 in that case?

kind regards,
Steven

