[mvapich-discuss] message truncated
Justin
luitjens at cs.utah.edu
Thu Nov 20 10:57:42 EST 2008
The message means MPI received a message larger than the buffer size you
specified. In this case the buffer length is '-514665432', so any
message at all would be bigger than it. What I find odd are the
parameters you are passing to MPI_Recv: you are specifying a count of
'945075466'. Are you really receiving a message that is a gigabyte in
size? It is possible that the count is being converted to a
signed int, causing it to wrap to a negative number. Check the size that
you are specifying for the buffer. It is odd that you have it specified
to be a GB in size when you are only receiving 4 bytes.
nilesh awate wrote:
>
> Thanks for the suggestion (use mvapich2-1.2), sir.
>
> I have tried the same, but we are still facing the same problem:
>
> Fatal error in MPI_Recv:
> Message truncated, error stack:
> MPI_Recv(186).......................: MPI_Recv(buf=0x7fff1faf6008,
> count=945075466, MPI_INT, src=2, tag=1000, MPI_COMM_WORLD,
> status=0x7fff1faf5fe0) failed
> MPIDI_CH3U_Request_unpack_uebuf(590): Message truncated; 4 bytes
> received but buffer size is -514665432
> rank 0 in job 4 test01_52519 caused collective abort of all ranks
> exit status of rank 0: killed by signal 9
>
> Is there any suggestion?
>
> What does this error mean?
>
> Is this a result of data corruption / packet loss, or something else?
>
> Waiting for a reply,
> Nilesh Awate
>
>
>
> ------------------------------------------------------------------------
> *From:* Dhabaleswar Panda <panda at cse.ohio-state.edu>
> *To:* nilesh awate <nilesh_awate at yahoo.com>
> *Cc:* MVAPICH2 <mvapich-discuss at cse.ohio-state.edu>
> *Sent:* Wednesday, 19 November, 2008 9:27:36 PM
> *Subject:* Re: [mvapich-discuss] message truncated
>
> MVAPICH2 1.2 was released around two weeks back. Can you try the latest
> version?
>
> DK
>
> On Wed, 19 Nov 2008, nilesh awate wrote:
>
> > Hi all,
> I am using mvapich2-1.0.3 with the DAPL interconnect (it's a
> proprietary NIC & DAPL library).
> I got the following error while running Pallas on a 5-node
> (AMD dual-core) cluster.
>
> Fatal error in MPI_Recv:
> Message truncated, error stack:
> MPI_Recv(186)..........................: MPI_Recv(buf=0x7fff24744cec,
> count=952788905, MPI_INT, src=2, tag=1000,MPI_COMM_WORLD,
> status=0x7fff24744cd0) failed
> MPIDI_CH3U_Post_data_receive_found(243): Message from rank 2 and tag
> 1000 truncated; 4 bytes received but buffersize is -483811676
> rank 0 in job 2 test01_40634 caused collective abort of all ranks
> exit status of rank 0: killed by signal 9
>
>
> Could you suggest where we should look to solve the above error?
> What can we interpret from the above message?
>
> Waiting for a reply,
> thanks,
> Nilesh
>
>
>
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
More information about the mvapich-discuss mailing list