[mvapich-discuss] error in ibv_channel_manager.c

Devendar Bureddy bureddy at cse.ohio-state.edu
Fri Aug 9 14:17:47 EDT 2013


Hi Ben

Event 0 (IBV_EVENT_CQ_ERR) is raised by the hardware when the entries of
the configured IB work completion queue (CQ) are overrun. In your case, it
seems the root is doing exhaustive sends/receives without polling, which
may be overrunning the default number of CQ entries (40000). Can you please
try increasing the CQ size with the parameter
MV2_DEFAULT_MAX_CQ_SIZE=64000 (or more) and see whether that has any effect?
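
As a rough sketch, the parameter can be passed on the mpirun_rsh command
line alongside the one you already set. The hostfile variable, process
count, and executable name below are taken from your report; the
hosts.txt fallback is only a hypothetical placeholder, and 64000 is just a
starting value to experiment with.

```shell
# Sketch only: adjust the hostfile, process count, and executable for your job.
CQ_SIZE=64000                           # raised from the default of 40000
HOSTFILE="${PBS_NODEFILE:-hosts.txt}"   # hosts.txt is a hypothetical fallback

# MVAPICH2 run-time parameters go between the mpirun_rsh options and the executable.
CMD="mpirun_rsh -hostfile $HOSTFILE -np 7200 MV2_USE_UD_HYBRID=0 MV2_DEFAULT_MAX_CQ_SIZE=$CQ_SIZE executable"

# Print the command for inspection before launching it.
echo "$CMD"
```

If the error persists, the same line can be rerun with a larger CQ_SIZE.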

Also, it seems you are using mvapich2-1.8.1, which is old. Please
upgrade to the latest stable release, i.e. 1.9.

-Devendar




On Fri, Aug 9, 2013 at 1:18 PM, Ben <Benjamin.M.Auer at nasa.gov> wrote:

> I'm getting this error message from mvapich while running a program at
> large core count. It appears to be during a read being performed by the
> root process. Does anyone have any ideas on what could be causing it?
>
>
> Got FATAL event 0
> at line 990 in file ibv_channel_manager.c
>
> my run command is:
>
> mpirun_rsh -hostfile $PBS_NODEFILE -np 7200 MV2_USE_UD_HYBRID=0 executable
>
>
> and a mpiname -a shows:
>
> MVAPICH2 1.8.1 Thu Sep 27 18:55:23 EDT 2012 ch3:mrail
>
> Compilation
> CC: icc -fpic -m64   -DNDEBUG -DNVALGRIND -O2
> CXX: icpc -fpic -m64  -DNDEBUG -DNVALGRIND -O2
> F77: ifort -fpic  -O2
> FC: ifort -fpic  -O2
>
> Configuration
> CC=icc CXX=icpc F77=ifort FC=ifort CFLAGS=-fpic -m64 CXXFLAGS=-fpic -m64
> FFLAGS=-fpic FCFLAGS=-fpic --enable-f77 --enable-fc --enable-cxx
> --enable-romio --enable-threads=default --with-hwloc
> --disable-multi-aliases --enable-xrc=yes --enable-hybrid
> --prefix=/usr/local/other/SLES11.1/mvapich2/1.8.1/intel-13.1.2.183
>
> --
> Ben Auer, PhD   SSAI, Scientific Programmer/Analyst
> NASA GSFC,  Global Modeling and Assimilation Office
> Code 610.1, 8800 Greenbelt Rd, Greenbelt, MD  20771
> Phone: 301-286-9176               Fax: 301-614-6246
>



-- 
Devendar