[mvapich-discuss] vapi error on scyld configuration

Matt Funk matze999 at gmx.net
Fri Apr 13 11:09:01 EDT 2007


Hi,

I am running a straightforward finite-difference code on a machine with 
InfiniBand. The code performs the same computations over and over again. 
I run it on 32 processes on a Scyld-configured cluster. 
After about 43000 timesteps (each timestep does the same computations) 
I get the following error:

[10.1.1.144:18] Fatal Error: Got an asynchronous event: VAPI_PORT_ERROR (VAPI_EV_SYNDROME_NONE) at line 110 in file viainit.c
[10.1.1.120:5] Got completion with error, code=VAPI_RETRY_EXC_ERR, vendor code=81, for 10.1.1.144:18
testNodePoisson2d.Linux.g++.g77.MPI32.ex: viacheck.c:2176: viutil_spinandwaitcq: Assertion `sc->status == VAPI_SUCCESS' failed.


I would normally suspect a bug in my code, but given that it does the same 
thing over and over again for an hour and a half before croaking, I wouldn't 
even know where to look for it. Does anyone know what this error 
might indicate?
The MPICH version I am using is 1.2.5.

thanks
mat

