[mvapich-discuss] Cleanup of killed jobs with torque+mvapich2

Sourav Chakraborty chakraborty.52 at buckeyemail.osu.edu
Tue Jul 31 13:06:39 EDT 2018


Hi Noam,

If the nodes are allocated exclusively (a node is not used by multiple jobs
concurrently), the epilog can kill all user processes (excluding root and
daemon processes). This is the method used on TACC Stampede. If the same
node can be shared by multiple users or jobs, you need to find the
correct PIDs from the job ID. Unfortunately, I don't have an example script
that does this.
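
That said, a rough sketch of the /proc/<PID>/environ approach (not
something we ship or have tested on a production cluster) might look like
the Python below. It assumes the job's processes actually carry PBS_JOBID
in their environment, which with mpirun_rsh means they were launched with
-export, and that the script runs as root so it can read other users'
environ files. The function names and the proc_root parameter are made up
for illustration.

```python
import os


def parse_environ(raw: bytes) -> dict:
    """Parse the NUL-separated KEY=VALUE blob from /proc/<PID>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            # Split on the first '=' only; values may contain '=' themselves.
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def pids_for_job(jobid: str, proc_root: str = "/proc") -> list:
    """Return the sorted PIDs whose environment has PBS_JOBID == jobid.

    proc_root is a hypothetical parameter, added so the function can be
    exercised against a fake directory tree instead of the real /proc.
    """
    pids = []
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as "self" or "cpuinfo"
        try:
            with open(os.path.join(proc_root, name, "environ"), "rb") as f:
                env = parse_environ(f.read())
        except OSError:
            # The process exited in the meantime, or we may not read it.
            continue
        if env.get("PBS_JOBID") == jobid:
            pids.append(int(name))
    return sorted(pids)
```

An epilogue could call pids_for_job() with the job ID it is given and
then signal each returned PID, for example with os.kill(pid,
signal.SIGKILL). Note that environ reflects the environment at exec time,
so a process that was started without PBS_JOBID will not be found this
way.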

Regarding the -export option: it is currently the only supported way
to export all environment variables. The idea of setting a flag in
/etc/mv2.conf to do this by default seems very useful! We'll definitely
look into it. In the meantime, one workaround is to create an alias or a
wrapper script around mpirun_rsh.
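
As an illustration of the wrapper idea, here is a minimal Python sketch
that adds -export when the caller did not pass it and then hands over to
the real launcher. The install path in REAL_MPIRUN_RSH and the function
names are assumptions; adjust them for your site.

```python
import os
import sys

# Hypothetical location of the real launcher; change to match your install.
REAL_MPIRUN_RSH = "/opt/mvapich2/bin/mpirun_rsh"


def build_argv(user_args, real=REAL_MPIRUN_RSH):
    """Return the argv for the real mpirun_rsh, adding -export if absent.

    Caveat: this looks at every argument, so an application argument that
    happens to be spelled "-export" would suppress the automatic flag.
    """
    args = list(user_args)
    if "-export" not in args:
        # mpirun_rsh options precede the executable, so put the flag first.
        args.insert(0, "-export")
    return [real] + args


def main():
    argv = build_argv(sys.argv[1:])
    # Replace the wrapper process with mpirun_rsh so signals and the exit
    # status behave as if the launcher had been called directly.
    os.execv(argv[0], argv)
```

To use it, call main() under the usual if __name__ == "__main__": guard
and install the script ahead of the real mpirun_rsh in users' PATH under
whatever name you prefer.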

Please let us know if you have any other queries.

Thanks,
Sourav



On Tue, Jul 31, 2018 at 9:43 AM Noam Bernstein <noam.bernstein at nrl.navy.mil>
wrote:

> > On Jul 31, 2018, at 8:16 AM, Noam Bernstein <noam.bernstein at nrl.navy.mil> wrote:
> >
> > If you have any suggestions for example epilogue scripts, I would be
> > happy to try that, but I’m not sure how to identify the mpirun_rsh-started
> > processes otherwise.  I guess I could kill all jobs that don’t belong to
> > root/etc.
>
> I was actually able to cobble something together using
> /proc/<PID>/environ.  I was hoping to use the content of the PBS_JOBID
> variable, but mpirun_rsh doesn’t export the environment to all the child
> processes by default.  I can pass “-export”, but is there any way to have
> that done by default (something in /etc/mv2.conf, e.g.)?
>
>
> Noam