[mvapich-discuss] [rsh] <defunct> ????

Sayantan Sur surs at cse.ohio-state.edu
Wed Mar 8 17:44:52 EST 2006


Hi Michael,

> >Can you please tell us if your application proceeds to completion?
> yes.

Glad to know that your applications are running fine.

> >Normally, mpirun_rsh uses `ssh' to exec jobs, unless specifically told
> >to use rsh (by using the -rsh flag). Do these "defunct" processes linger
> >around or do they disappear when the job finishes?  As far as I know,
> I believe that "defunct" processes disappear when the job finishes.
> My application normally takes a long time, so I did not pay much
> attention to the "defunct" processes. When I kill my application, the
> "defunct" processes disappear.

When mpirun_rsh execs the remote processes, it starts them with a
command line like:

$ rsh n0 /home/user/prog/a.out

Hence, the rsh process (which spawned the process on n0) stays alive on
the master node (i.e. the node from which you started the job) until the
process on n0 (a.out, which was your MPI program) finishes.

I would hazard a guess that on your cluster, these rsh processes on the
master node are showing up as "defunct". Do any of the other nodes in
the computation also show these processes?
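
As background, a process is listed as "defunct" once it has exited but
its parent has not yet collected its exit status. The following is a
minimal Python sketch of that general Unix behaviour, not code from
mpirun_rsh. It assumes Linux (it reads /proc), and the names child_state
and demo are hypothetical, chosen only for this illustration:

```python
import os
import time


def child_state(pid):
    """Return the one-letter state of a process from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The state letter is the first field after the parenthesised command name.
    return data.rsplit(")", 1)[1].split()[0]


def demo():
    """Fork a child that exits at once, observe it as defunct, then reap it."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately without running cleanup handlers

    # Parent: the child has exited (or is about to) but has not been waited on,
    # so it should show up in state "Z" (zombie, printed by ps as <defunct>).
    deadline = time.time() + 5
    state = child_state(pid)
    while state != "Z" and time.time() < deadline:
        time.sleep(0.01)
        state = child_state(pid)

    os.waitpid(pid, 0)  # reap the child; its process table entry is released
    still_listed = os.path.exists(f"/proc/{pid}")
    return state, still_listed


if __name__ == "__main__":
    state_before_reap, still_listed = demo()
    print("state before reaping:", state_before_reap)
    print("still listed after reaping:", still_listed)
```

On a live system, the STAT column of `ps` shows the same state letter, so
entries marked Z there correspond to the "defunct" processes discussed
here.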

Since the applications are proceeding to completion correctly and the
"defunct" processes disappear after the program ends, I'd say that there
isn't any major problem with your installation.

> I always use rsh. The reason is a performance concern.
> This raises another question: in general, should one use rsh or ssh?
> From a performance point of view, I believe that one should use rsh.
> 
> I always manually change mpirun.vapi as follows:
> 
>  # org    $Show $MPIRUN_HOME/mpirun_rsh $via_args $progname $cmdLineArgs
>     $Show $MPIRUN_HOME/mpirun_rsh -rsh $via_args $progname $cmdLineArgs
> 
> Can you explain why you choose ssh as default ?

We chose ssh as the default since many Linux distributions disable rsh
for security reasons. We believe the choice of rsh/ssh shouldn't
impact end application performance, because once the jobs are launched
all communication happens over InfiniBand (and is unencrypted).

However, the final choice of rsh/ssh is left to the user.

> I have not upgraded to 1.8.1 or 1.8.2 yet, because I am afraid that
> it may cause other unknown issues.

OK. If you face any problems with the current installation, we'll be
glad if you could consider upgrading. The latest IBGD stacks feature
updated firmware too.

Thanks,
Sayantan.

-- 
http://www.cse.ohio-state.edu/~surs

