[mvapich-discuss] stack smashing detected

Sayantan Sur surs at cse.ohio-state.edu
Tue Nov 9 17:03:30 EST 2010


Hi Thiago,

I just tried it out, and it seems to work:

head$ ./bin/mpirun_rsh -np 1 node133 ./examples/cpi
Process 0 of 1 is on node133.cluster
pi is approximately 3.1415926544231341, Error is 0.0000000008333410
wall clock time = 0.000320

Are there any specific MPI programs with which you see this more
often? Also, do you have any error output from when this happens?
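
In case it helps separate an ssh problem from an MPI one, here is a
minimal sketch (not part of MVAPICH; the names check_host and SSH_CMD
are made up for illustration) that runs a trivial command over plain
ssh to each node and reports whether the "stack smashing detected"
message already shows up without mpirun_rsh involved:

```shell
#!/bin/sh
# Sketch: check whether plain ssh (without MPI) already fails from this node.
# SSH_CMD and check_host are hypothetical names used only for illustration.
SSH_CMD=${SSH_CMD:-ssh}

# Run a trivial remote command, capture stdout and stderr, and report whether
# the output contains the stack-smashing message quoted in this thread.
check_host() {
    host=$1
    out=$($SSH_CMD "$host" true 2>&1)
    status=$?
    case $out in
        *"stack smashing detected"*)
            echo "$host: stack smashing in ssh (exit $status)" ;;
        *)
            if [ "$status" -eq 0 ]; then
                echo "$host: ssh ok"
            else
                echo "$host: ssh failed (exit $status)"
            fi ;;
    esac
}

# Check every host named on the command line.
for h in "$@"; do
    check_host "$h"
done
```

Running it from both a working and a failing head node (for example
"sh check_ssh.sh node1 node2", where the file name is hypothetical)
should show whether ssh alone reproduces the failure.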

Thanks.

On Tue, Nov 9, 2010 at 4:40 PM, Thiago Ize <thiago at sci.utah.edu> wrote:
> I'm the one who found that problem.  I've had this happen with several
> versions: the mvapich that comes with the system and the mvapich2-1.5.1
> that I downloaded and compiled myself.
>
> The 1.5.1 version works if I run locally, probably because it's not using
> ssh?  But if I try to run on a remote node I get the same error.  For
> example:
> node1 $ mpirun_rsh -np 1 node1 mpiProgram -> works
> node1 $ mpirun_rsh -np 1 node2 mpiProgram -> fails
>
> Also, if I log in to a node where this still works, I can still run on
> the "bad" remote nodes.  For example:
> nodeGood $ mpirun_rsh -np 1 node2 mpiProgram -> works
> node1 $ mpirun_rsh -np 1 node2 mpiProgram -> does not
>
> Thiago
>
> Sayantan Sur wrote:
>
> Hi Nick,
>
> On Tue, Nov 9, 2010 at 2:13 PM, Nick Rathke <nick at sci.utah.edu> wrote:
>
>
> Hi,
>
> We have a small 64-node cluster running RHEL 5.4 and mvapich 1.2.0, and we
> have started getting the error " *** stack smashing detected ***:
> /usr/bin/ssh terminated " on some of our nodes but not others, even though
> all of the nodes are identical.
>
> I have been searching the web for this error but haven't found anything that
> would help me debug this or even tell whether this is an mvapich or ssh error.
>
> Any thoughts would be greatly appreciated.
>
>
>
> Just wondering if you saw this with any older MVAPICH version (say,
> MVAPICH-1.1) or some of the newer MVAPICH2 releases?
>
> If you could try MVAPICH2-1.5.1 and see if this error persists, that
> would be great.
>
> Thanks.
>
>
>
> Nick Rathke
> Scientific Computing and Imaging Institute
> IT Manager and Sr. Systems Administrator
> nick at sci.utah.edu
> www.sci.utah.edu
>
>



-- 
Sayantan Sur

Research Scientist
Department of Computer Science
http://www.cse.ohio-state.edu/~surs


