[mvapich-discuss] messege truncated

Justin luitjens at cs.utah.edu
Thu Nov 20 10:57:42 EST 2008


The error means MPI received a message larger than the receive buffer
you specified. In this case the buffer size is reported as
'-514665432' bytes, so a message of any length is larger than it. What
I find odd are the parameters you are passing to MPI_Recv. You are
passing a count of '945075466' MPI_INTs; are you really expecting a
message several gigabytes in size? At 4 bytes per MPI_INT that is about
3.8 GB, which does not fit in a signed 32-bit int, so the byte size is
probably wrapping around to the negative number shown. Check the count
you are specifying for the buffer. It is odd to ask for gigabytes when
you are only receiving 4 bytes.
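
To make the wrap-around concrete, here is a minimal sketch in plain C (no
MPI calls). Both function names are made up for illustration, and it
assumes a 4-byte MPI_INT and that the library keeps the byte size in a
signed 32-bit int; I have not checked the MVAPICH2 source to confirm that.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of MPI or MVAPICH2): the byte size of
 * `count` elements as it would appear if stored in a signed 32-bit int.
 * Assumes the byte size is computed as count * element size in 32 bits. */
static int32_t wrapped_byte_size(int32_t count, int32_t elem_size)
{
    assert(count >= 0 && elem_size > 0);
    int64_t exact = (int64_t)count * elem_size;  /* true size, no overflow */
    uint32_t low = (uint32_t)exact;              /* keep the low 32 bits */
    /* Reinterpret the low 32 bits as a two's-complement value. */
    int64_t wrapped = (low > (uint32_t)INT32_MAX)
                          ? (int64_t)low - 4294967296LL
                          : (int64_t)low;
    return (int32_t)wrapped;
}

/* Hypothetical pre-flight check before a receive: returns 1 if `count`
 * elements of `elem_size` bytes fit both in a signed 32-bit byte size
 * and in a buffer of `buf_bytes` bytes, otherwise 0. */
static int recv_count_is_sane(int64_t count, int64_t elem_size,
                              int64_t buf_bytes)
{
    if (count < 0 || elem_size <= 0 || buf_bytes < 0)
        return 0;
    if (count > INT32_MAX / elem_size)  /* byte size would exceed INT32_MAX */
        return 0;
    return count * elem_size <= buf_bytes;
}
```

With the count from your error stack, wrapped_byte_size(945075466, 4)
gives -514665432, the same buffer size MPI reported. A check like
recv_count_is_sane(count, 4, sizeof(buf)) just before the MPI_Recv call
would reject that count and point at wherever it is being computed.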

nilesh awate wrote:
>
> Thanks for the suggestion (use mvapich2-1.2), sir.
>
> I have tried it, but we are still facing the same problem:
>
> Fatal error in MPI_Recv:
> Message truncated, error stack:
> MPI_Recv(186).......................: MPI_Recv(buf=0x7fff1faf6008, 
> count=945075466, MPI_INT, src=2, tag=1000, MPI_COMM_WORLD, 
> status=0x7fff1faf5fe0) failed
> MPIDI_CH3U_Request_unpack_uebuf(590): Message truncated; 4 bytes 
> received but buffer size is -514665432
> rank 0 in job 4  test01_52519   caused collective abort of all ranks
> exit status of rank 0: killed by signal 9
>
> Is there any suggestion?
>
> What does this error mean?
>
> Is this a result of data corruption, missing packets, or something else?
>
> Waiting for your reply,
> Nilesh Awate
>
>
>
> ------------------------------------------------------------------------
> *From:* Dhabaleswar Panda <panda at cse.ohio-state.edu>
> *To:* nilesh awate <nilesh_awate at yahoo.com>
> *Cc:* MVAPICH2 <mvapich-discuss at cse.ohio-state.edu>
> *Sent:* Wednesday, 19 November, 2008 9:27:36 PM
> *Subject:* Re: [mvapich-discuss] messege truncated
>
> MVAPICH2 1.2 was released around two weeks back. Can you try the latest
> version?
>
> DK
>
> On Wed, 19 Nov 2008, nilesh awate wrote:
>
> > Hi all,
> I am using mvapich2-1.0.3 with a dapl interconnect (it's a
> proprietary NIC & dapl library).
> I got the following error while running Pallas over a 5-node
> (AMD dual-core) cluster.
>
> Fatal error in MPI_Recv:
> Message truncated, error stack:
> MPI_Recv(186)..........................: MPI_Recv(buf=0x7fff24744cec, 
> count=952788905, MPI_INT, src=2, tag=1000,MPI_COMM_WORLD, 
> status=0x7fff24744cd0) failed
> MPIDI_CH3U_Post_data_receive_found(243): Message from rank 2 and tag 
> 1000 truncated; 4 bytes received but buffersize is -483811676
> rank 0 in job 2  test01_40634  caused collective abort of all ranks
>   exit status of rank 0: killed by signal 9
>
>
> Can you suggest where we should look to solve the above error?
> What can we interpret from the above message?
>
> Waiting for your reply,
> Thanks,
> Nilesh
>
>
>


