[mvapich-discuss] Problem with MVAPICH2 (1.9, r6338) on InfiniBand

Steven Vancoillie steven.vancoillie at gmail.com
Mon Jul 8 17:29:18 EDT 2013


Hi Devendar,

I installed and ran the OSU micro-benchmarks (thanks for pointing me
there), and they showed the same problem.
However, I think you just solved it, as setting MV2_NUM_HCAS=2 works
fine. Even though I didn't know what I was doing, I thought I'd just
try a number different from the default. Since one has to set this
parameter manually, I guess it's not trivial to detect automatically?
If so, how could one find out what the correct number is?
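(For anyone else hitting this: one way to check how many HCAs the verbs layer actually sees is `ibv_devinfo -l`, which prints a count line followed by one device name per line. This is just a sketch; the exact output format can vary between OFED releases, so treat the parsing as an assumption.)

```shell
#!/bin/sh
# Sketch: count the InfiniBand HCAs visible to the verbs layer.
# Assumes `ibv_devinfo -l` prints a header line ("N HCAs found:")
# followed by one device name per line; falls back to 0 if the
# verbs utilities are not installed on this node.
if command -v ibv_devinfo >/dev/null 2>&1; then
    num_hcas=$(ibv_devinfo -l | tail -n +2 | grep -c .)
else
    num_hcas=0  # no verbs tools available; assume no HCAs detected
fi
echo "detected HCAs: $num_hcas"
```

The resulting count could then be passed at run time, e.g. `MV2_NUM_HCAS=2 mpiexec -np 2 ./osu_latency`.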

Anyway, thanks a lot for your help!

greetings,
Steven

On Mon, Jul 8, 2013 at 9:37 PM, Devendar Bureddy
<bureddy at cse.ohio-state.edu> wrote:
> Hi Steven
>
> The ch3:mrail is the default configuration. You can see more details
> here: http://mvapich.cse.ohio-state.edu/support/user_guide_mvapich2-1.9.html#x1-110004.4
>
> Please do not confuse the "mrail" configuration with multi-rail network
> support. The use of multiple IB network lanes (HCAs) is controlled with
> the MV2_NUM_HCAS run-time parameter.
>
> Incorrect usage of the HCAs could produce similar errors. Are the basic
> osu-benchmarks running correctly in your setup?
>
> -Devendar
>
>
> On Mon, Jul 8, 2013 at 12:21 PM, Steven Vancoillie
> <steven.vancoillie at gmail.com> wrote:
>>
>> Hi,
>>
>> when running an application (Global Arrays test suite) on top of
>> MVAPICH2 (via ARMCI-MPI), I get the following error:
>>
>> [0->3] send desc error, wc_opcode=0
>> [0->3] wc.status=12, wc.wr_id=0x6cf5c8, wc.opcode=0, vbuf->phead->type=54
>> = MPIDI_CH3_PKT_CLOSE
>> [r5i1n6:mpi_rank_0][MPIDI_CH3I_MRAILI_Cq_poll]
>> src/mpid/ch3/channels/mrail/src/gen2/ibv_channel_manager.c:586: [] Got
>> completion with error 12, vendor code=0x81, dest rank=3
>>
>> I would be grateful if someone can help me resolve this issue.
>>
>> This is the output of mpiexec --version:
>>
>> HYDRA build details:
>>     Version:                                 3.0.3
>>     Release Date:                            unreleased development copy
>>     CC:                              icc
>>     CXX:                             icpc
>>     F77:                             ifort
>>     F90:                             ifort
>>     Configure options:
>> '--disable-option-checking' '--prefix=/apps/leuven/mvapich2/1.9_intel'
>> 'CC=icc' 'CXX=icpc' 'F77=ifort' '--enable-f77' '--enable-fc'
>> '--enable-cxx' '--enable-romio' '--enable-debuginfo' '--enable-mpe'
>> '--enable-shared' '--without-ftb' '--with-mpe' '--disable-ckpt'
>> '--disable-mcast' '--disable-checkerrors' '--enable-embedded-mode'
>> '--cache-file=/dev/null' '--srcdir=.' 'CFLAGS= -DNDEBUG -DNVALGRIND
>> -O2' 'LDFLAGS=-L/lib -L/lib -Wl,-rpath,/lib -L/lib' 'LIBS=-libumad
>> -libverbs -lrt -lnuma -lpthread ' 'CPPFLAGS=
>>
>> -I/data/leuven/source/mvapich2/1.9_intel/mvapich2-1.9-r6338/src/mpl/include
>>
>> -I/data/leuven/source/mvapich2/1.9_intel/mvapich2-1.9-r6338/src/mpl/include
>> -I/data/leuven/source/mvapich2/1.9_intel/mvapich2-1.9-r6338/src/openpa/src
>> -I/data/leuven/source/mvapich2/1.9_intel/mvapich2-1.9-r6338/src/openpa/src
>>
>> -I/data/leuven/source/mvapich2/1.9_intel/mvapich2-1.9-r6338/src/mpi/romio/include
>> -I/include -I/include'
>>     Process Manager:                         pmi
>>     Launchers available:                     ssh rsh fork slurm ll lsf
>> sge manual persist
>>     Topology libraries available:            hwloc
>>     Resource management kernels available:   user slurm ll lsf sge pbs
>> cobalt
>>     Checkpointing libraries available:
>>     Demux engines available:                 poll select
>>
>> Furthermore, someone suggested that I build MVAPICH2 without mrail
>> support: "you should probably use the default build of mvapich instead
>> of the mrail build, unless you want to use multiple network lanes
>> simultaneously". Unfortunately, I can't really tell from the user
>> guide how I should do this, or even what multi-rail means. Is there
>> some online documentation that addresses this, or explains how I
>> should build it?
>>
>> kind regards,
>> Steven
>> _______________________________________________
>> mvapich-discuss mailing list
>> mvapich-discuss at cse.ohio-state.edu
>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
>
>
>
> --
> Devendar
