[mvapich-discuss] Mvapich2-1.2 for OpenFabrics IB/iWARP : Job terminates with error

Dhabaleswar Panda panda at cse.ohio-state.edu
Tue Feb 17 09:57:36 EST 2009


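Completion error code 12 is a transport retry timeout: the sender got no
acknowledgement from the destination before its retry count ran out. It
could be a bad cable/HCA/switch leaf. If the system is really large then
it could be congestion.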
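
For anyone triaging a batch of these aborts, here is a minimal sketch (not
part of MVAPICH2) that pulls the numeric fields out of the abort message
and maps the completion code to its name. The table is only a subset of the
ibv_wc_status enum from libibverbs (infiniband/verbs.h); the function name
and the returned dictionary layout are assumptions made for illustration.

```python
import re

# Subset of the ibv_wc_status enum in libibverbs (infiniband/verbs.h).
# Codes outside this subset are reported as unknown rather than guessed.
WC_STATUS = {
    0: "IBV_WC_SUCCESS",
    5: "IBV_WC_WR_FLUSH_ERR",
    12: "IBV_WC_RETRY_EXC_ERR",      # transport retry counter exceeded
    13: "IBV_WC_RNR_RETRY_EXC_ERR",  # receiver-not-ready retry counter exceeded
}

# Matches the abort text quoted below; \s* tolerates the line wrap between
# "error 12," and "vendor code=81".
ABORT_RE = re.compile(
    r"Got completion with error (\d+),\s*vendor code=(\w+),\s*dest rank=(\d+)"
)


def parse_completion_error(text):
    """Return the fields of an abort message as a dict, or None if absent."""
    match = ABORT_RE.search(text)
    if match is None:
        return None
    code = int(match.group(1))
    return {
        "code": code,
        "status": WC_STATUS.get(code, "unknown (not in this subset)"),
        # Kept as a string: the vendor code is HCA-specific and its base
        # (decimal or hex) is not stated in the message.
        "vendor_code": match.group(2),
        "dest_rank": int(match.group(3)),
    }


if __name__ == "__main__":
    sample = ("[14] Abort: [] Got completion with error 12,\n"
              "vendor code=81, dest rank=0")
    print(parse_completion_error(sample))
```

Grouping the parsed results by reporting host and destination rank can help
tell a single bad link (the same node keeps showing up) from fabric-wide
congestion (failures spread across many nodes).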

Thanks,

DK

On Tue, 17 Feb 2009, Vivek Gavane wrote:

> Hello,
>      I have mvapich2-1.2 compiled with the following options:
>
>
> ./configure --with-rdma=gen2 --enable-sharedlibs=gcc --enable-g=dbg
> --enable-debuginfo --with-ib-include=/opt/OFED/include
> --with-ib-libpath=/opt/OFED/lib64 --prefix=/home/apps/mvapich2-1.2
>
> After I submit a job, the job completes but the following errors are
> reported on the console:
>
> -------------------------------------------------------------
> send desc error
> Exit code -5 signaled from ibc0-16
> Killing remote processes...[14] Abort: [] Got completion with error 12,
> vendor code=81, dest rank=0
>  at line 553 in file ibv_channel_manager.c
> MPI process terminated unexpectedly
> DONE
> ------------------------------------------------------------
>
> And in the redirected output file, the following errors are reported at
> the end:
> -----------------------------------------
> cleanupSignal 15 received.
> Signal 15 received.
> Signal 15 received.
> Signal 15 received.
> -----------------------------------------
>
> Does anyone know the reason for this?
>
> Thanks in advance.
> --
> Regards,
> Vivek Gavane
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>


