[mvapich-discuss] MVAPICH Error

Dhabaleswar Panda panda at cse.ohio-state.edu
Sat Aug 4 08:14:30 EDT 2007


Hi Thomas, 

Are you seeing this behavior with MVAPICH2 0.9.8p2 with the patch
Gopal had sent to you on July 7th?

Have you tried MVAPICH2 0.9.8p3 or the latest release, MVAPICH2
1.0-beta? Do you see the same behavior with these two versions
as well? In these versions we have applied a better solution to the
problem you originally reported.

If you can let us know which version you are currently using, it will
help us narrow down the problem further.

Best Regards, 

DK

> Hello again,
> 
> Thanks for all your help in the past; I've been able to get my code up
> and running on a small 32 processor cluster. I'm doing scaling tests and
> I ran with an array size of 16x16x16 on 1, 2, 4, 8, and 16 processors and
> saw fairly good scaling. When I increased the array sizes to 32x32x32 my
> code runs fine for all but the 8 processor case. The odd part is that it
> doesn't crash until the 15th iteration, and I'm doing 21 iterations for
> each case. Here is the error it produces:
> 
> ch3_rndvtransfer.c:614: MPIDI_CH3_Get_rndv_push: Assertion
> '(get_resp_pkt->seqnum) + 1 == (vc)->seqnum_send' failed.
> 
> I imagine this will be a pain for me to debug since it takes about 30
> minutes to get to the point where it fails. Ever seen this error or have
> any idea what might be causing it? Any tips would be greatly
> appreciated.
> 
> Thanks,
> 
> Thomas O'Shea



