[mvapich-discuss] vapi error on scyld configuration

Matt Funk matze999 at gmx.net
Fri Apr 13 11:57:24 EDT 2007


Hi Abhinav,

thank you very much for the advice.

My only question before I begin: can I do this without admin
privileges? I am only a user on the machine. I have asked the admin to
upgrade several times, but nothing has happened. He says that he hasn't
received any new software from the vendor.


thanks
mat

On Friday 13 April 2007 09:41, Abhinav Vishnu wrote:
> Hi Matt,
>
> Thanks for using MVAPICH and reporting the problem to us. I believe that
> you are using a *very old* version of MVAPICH with VAPI drivers. May I
> suggest that you upgrade the MPI stack and the userspace libraries? We
> have recently released MVAPICH-0.9.9 beta and are planning to release the
> final version in a week or so. Please download it from the MVAPICH webpage.
>
> http://nowlab.cse.ohio-state.edu/projects/mpi-iba/
>
> Also, please upgrade the userspace libraries to OFED from the following
> URL:
>
> http://www.openfabrics.org/downloads.htm
>
> You may also need to upgrade your firmware for adapters and/or switches:
>
> http://www.mellanox.com/support/firmware_download.php
>
> Please re-run your applications with the settings above and let us know
> the outcome of your experimentation.
>
> Thanks,
>
> :- Abhinav
>
> * On Apr,1 Matt Funk<matze999 at gmx.net> wrote :
> > Hi,
> >
> > I am running a straightforward finite difference code on a machine with
> > InfiniBand. The code does the same computations over and over again,
> > once per timestep. I run it on 32 procs with InfiniBand on a Scyld
> > configuration machine.
> > After running for about 43000 timesteps, I get the following error:
> >
> > [10.1.1.144:18] Fatal Error: Got an asynchronous event: VAPI_PORT_ERROR
> > (VAPI_EV_SYNDROME_NONE) at line 110 in file viainit.c
> > [10.1.1.120:5] Got completion with error, code=VAPI_RETRY_EXC_ERR, vendor
> > code=81, for 10.1.1.144:18
> > testNodePoisson2d.Linux.g++.g77.MPI32.ex: viacheck.c:2176:
> > viutil_spinandwaitcq: Assertion `sc->status == VAPI_SUCCESS' failed.
> >
> >
> > I would think that this is a bug in my code, but given that it does the
> > same thing over and over again for an hour and a half before croaking, I
> > wouldn't even know where to look for the bug. Anyway, does someone know
> > what this might indicate?
> > The MPICH version I am using is 1.2.5.
> >
> > thanks
> > mat
> > _______________________________________________
> > mvapich-discuss mailing list
> > mvapich-discuss at cse.ohio-state.edu
> > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss

