[mvapich-discuss] Process Termination Detection with mpirun_rsh
Dhabaleswar Panda
panda at cse.ohio-state.edu
Wed Aug 20 15:56:52 EDT 2008
Hi Tom and Fred,
Thanks for reporting this issue and for sending the follow-up
comments. We started some internal discussions here this morning and
suspected that the difference between rsh and ssh was involved. That
seems to be the case. We will look for a fix for this rsh-related
problem and get back to you in a few days.
Thanks,
DK
On Wed, 20 Aug 2008, Tom Crockett wrote:
> Tom Crockett wrote:
> > Following abnormal process termination on the remote node, there will be
> > only one active rsh process and one defunct rsh process, confirming that
> > the remote processes have cleaned up and exited. So it seems that
> > mpirun_rsh is not responding properly to the death of a child process.
>
> I've been poking around in the source code for mpirun_rsh and mpispawn,
> and I think I've figured out what the trouble is -- I'm just not sure
> what to do about it. mpispawn correctly notices that one of its
> children has terminated abnormally, kills off all of its other children,
> and exits with a non-zero return code. So far, so good.
>
> Unfortunately, rsh (unlike ssh) does not propagate this return code back
> as its own exit status, and instead exits with a return code of 0.
> mpirun_rsh incorrectly interprets this to mean that the remote processes
> have terminated normally. Instead of jumping into its cleanup procedure
> to kill off the remaining processes in the job, it just sits around
> waiting for its other children to exit, which they will never do without
> outside intervention.
>
> One potential workaround would be to use ssh instead of rsh, but we much
> prefer to use rsh for spawning remote processes in our clusters. There
> are two main reasons for this: (1) rsh is simpler, faster, easier to
> configure, and less susceptible to breaking when users customize their
> personal settings, and (2) as a rule we disallow ssh and rlogin access
> to our compute nodes so that users will have fewer pathways to
> circumvent the job scheduler.
>
> So what is really needed is either a custom version of rsh which mirrors
> the return status of its remote command, or else some other mechanism by
> which mpispawn can notify mpirun_rsh when something bad happens to one
> of its children. I'm curious if the former already exists somewhere?
>
>
> > Interestingly, whether the master node detects the remote process
> > termination seems to depend on how the remote process dies. If I hit
> > the remote process with a SIGTERM, mpirun_rsh seems to notice and things
> > get cleaned up after a minute or two. If it terminates with something
> > else (e.g., a SIGSEGV), the job will sit there forever.
>
> I haven't dug deeply into this behavior yet, but I conjecture that the
> SIGTERM is being caught by the MPI processes and is being handled in
> MPI-land, whereas most other signals (such as SIGSEGV) are not being
> trapped at the application level.
>
> -Tom
>
> --
> Tom Crockett
>
> College of William and Mary            email: twcroc at wm.edu
> IT/High Performance Computing Group    phone: (757) 221-2762
> Savage House                           fax:   (757) 221-2023
> P.O. Box 8795
> Williamsburg, VA 23187-8795
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
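One common way to get the "mirrored return status" Tom asks about, without a custom rsh, is to have the remote shell print the command's exit status as a trailing marker and let the launcher parse it. The Python sketch below only illustrates that idea; it is not how mpirun_rsh or mpispawn work. The marker string and the function name run_with_status are hypothetical, and the remote login shell is assumed to be POSIX-compatible (csh-style shells use $status rather than $?).

```python
import subprocess

# Hypothetical sentinel; any string unlikely to appear in normal output works.
MARKER = "__REMOTE_EXIT_STATUS__="


def run_with_status(launcher, command):
    """Run `command` through `launcher` and recover the command's own status.

    `launcher` is an argv prefix such as ["rsh", "node01"]. For local
    experiments ["sh", "-c"] behaves similarly: the wrapped command line
    ends with `echo`, so the launcher itself exits 0 even when `command`
    fails, which is the rsh behavior described above.
    """
    # Run the command in a subshell so that an `exit` inside it cannot
    # skip the trailing echo that reports the status.
    wrapped = f"( {command} ); echo {MARKER}$?"
    proc = subprocess.run(launcher + [wrapped], capture_output=True, text=True)

    out = proc.stdout
    idx = out.rfind(MARKER)
    if idx == -1:
        # No marker: the launcher or the remote shell failed before the
        # echo ran. Report this as a failure rather than as success.
        return (proc.returncode or 1), out

    status = int(out[idx + len(MARKER):].strip())
    return status, out[:idx]


if __name__ == "__main__":
    # Local stand-in for rsh; replace with ["rsh", "<host>"] on a cluster.
    print(run_with_status(["sh", "-c"], "echo hi; exit 3"))
```

With a POSIX shell, a command killed by a signal is reported as 128 plus the signal number, so a launcher using this scheme could treat any non-zero value as grounds to start its cleanup procedure. The trade-off is that the marker travels in the command's stdout, so it has to be stripped before the output is passed on.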