[mvapich-discuss] viacheck.c error?

Abhinav Vishnu vishnu at cse.ohio-state.edu
Thu Feb 22 16:50:28 EST 2007


Hi jen,

> I'm not 100% sure of what information will be most helpful, but the
> error output for osu_bw (as an example) is:
> --------------------------------------------------
> Connection closed by 172.16.4.36
> [0] Abort: [b09n040.oic.ornl.gov:0] Got completion with error, code=1
> at line 2355 in file viacheck.c
> done.
> ---------------------------------------------------
>
I think the problem occurs because your ssh connection was
terminated while the application was running. As a result, any
process that tries to communicate with a process on the node that
died gets the "completion with error" during data transmission.
IMHO, your sysadmin should be able to help you find out why the
ssh connection is being terminated.
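
To make the failure mode concrete, here is a minimal sketch of the kind
of check that leads to such an abort. It is not the actual code from
viacheck.c. The names demo_wc and check_completion are hypothetical
stand-ins so that the example compiles without libibverbs; real code
would poll a completion queue with ibv_poll_cq() and inspect the status
field of struct ibv_wc, where IBV_WC_SUCCESS is 0.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct ibv_wc: only the status is modeled.
 * As with IBV_WC_SUCCESS, a status of 0 means the work request succeeded. */
struct demo_wc {
    int status;
};

/* Hypothetical helper: returns 0 when the completion is clean. Otherwise
 * it writes a message in the style of the log above into msg and
 * returns -1, at which point an MPI library would typically abort. */
int check_completion(const struct demo_wc *wc, const char *host, int rank,
                     char *msg, size_t len)
{
    assert(wc != NULL && host != NULL && msg != NULL && len > 0);

    if (wc->status == 0) {
        msg[0] = '\0';   /* nothing to report */
        return 0;
    }

    /* A peer that died (for example because its ssh session was closed)
     * shows up here as a non-zero completion status on the survivor. */
    snprintf(msg, len, "[%s:%d] Got completion with error, code=%d",
             host, rank, wc->status);
    return -1;
}
```

The point of the sketch is that the error is reported by the surviving
process, so the host named in the abort message is not necessarily the
one that failed first.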

>
> The BUILD_ID file of my ofed is:
> ---------------------------------------------------------------------
> OFED-1.0
>
> openib-1.0 (REV=8031)
> # User space
> https://openib.org/svn/gen2/branches/1.0/src/userspace
> # Kernel space
> https://openib.org/svn/gen2/branches/1.0/ofed/tags/1.0/linux-kernel
> Git:
> ref: refs/heads/for-2.6.17
> commit 959eb39297e8c82f61fbfc283ad4ff11c883bf1e
>
> # MPI
> mpi_osu-0.9.7-mlx2.1.0.tgz
> openmpi-1.1b1-1.src.rpm
> mpitests-1.0-0.src.rpm
> --------------------------------------------------------------------------- 
>
>
> so that may be a problem - that it is ofed 1.0?
>
>
> ------------------------------------------------------------------------------- 
>
> enum ibv_event_type {
>        IBV_EVENT_CQ_ERR,
>        IBV_EVENT_QP_FATAL,
>        IBV_EVENT_QP_REQ_ERR,
>        IBV_EVENT_QP_ACCESS_ERR,
>        IBV_EVENT_COMM_EST,
>        IBV_EVENT_SQ_DRAINED,
>        IBV_EVENT_PATH_MIG,
>        IBV_EVENT_PATH_MIG_ERR,
>        IBV_EVENT_DEVICE_FATAL,
>        IBV_EVENT_PORT_ACTIVE,
>        IBV_EVENT_PORT_ERR,
>        IBV_EVENT_LID_CHANGE,
>        IBV_EVENT_PKEY_CHANGE,
>        IBV_EVENT_SM_CHANGE,
>        IBV_EVENT_SRQ_ERR,
>        IBV_EVENT_SRQ_LIMIT_REACHED,
>        IBV_EVENT_QP_LAST_WQE_REACHED
> };
> ------------------------------------------------------------------------------------ 
>
>
> Soooo, I assume my new mission is to get ofed 1.1?   :}
>

Yes, I guess this should be the safest bet. Please let us know
the outcome of your experimentation.

Thanks,

:- Abhinav
> Thanks!!!!
> Jen
>
>


