[mvapich-discuss] hydra, stdin close(), and SLURM

Sourav Chakraborty chakraborty.52 at buckeyemail.osu.edu
Sat Jul 25 01:28:46 EDT 2015


Hi Aaron,

Thanks for your note. Unfortunately we can be of little help here as Hydra
is designed and maintained by the MPICH team. Can you please contact the
MPICH team with this suggestion?

Thanks,
Sourav


On Fri, Jul 24, 2015 at 8:15 PM, Aaron Knister <aaron.s.knister at nasa.gov>
wrote:

> This is a bit of a cross post from a thread I started on the slurm dev
> list: http://article.gmane.org/gmane.comp.distributed.slurm.devel/8176
>
> I'd like to get feedback on the idea that "--input none" be passed to srun
> when using the SLURM hydra bootstrap mechanism. I figured it would be
> inserted here
> http://trac.mpich.org/projects/mpich/browser/src/pm/hydra/tools/bootstrap/external/slurm_launch.c#L98
> .
>
> Without this argument I'm getting spurious job aborts and confusing
> errors. The gist of it is that mpiexec.hydra closes stdin before it execs
> srun. srun then (possibly via the munge libraries) calls some function
> that does a lookup via NSS. We use sssd for AAA, so libnss_sssd handles
> this request. As part of sssd's caching mechanism, the library open()s
> the cache file. Because stdin has been closed, the lowest fd available is
> 0, so that is the descriptor open() returns. srun then believes it has
> stdin attached, which causes the issues outlined in the slurm dev post.
> I think passing "--input none" is the right thing to do here, since
> hydra has in fact closed stdin to srun. I tested this via the
> HYDRA_LAUNCHER_EXTRA_ARGS environment variable and it does resolve the
> errors I described.
>
> Thanks!
> -Aaron
>
> --
> Aaron Knister
> NASA Center for Climate Simulation (Code 606.2)
> Goddard Space Flight Center
> (301) 286-2776
>
>
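
For anyone following the thread, the fd reuse described in the quoted
message can be reproduced outside of srun. The following is only a minimal
sketch, assuming a POSIX system; the function name and the use of
os.devnull as a stand-in for the sssd cache file are illustrative choices,
not anything taken from Hydra, SLURM, or sssd. It closes fd 0, opens a
file, and reports which descriptor number comes back.

```python
import os


def fd_reused_after_closing_stdin(path=os.devnull):
    """Close fd 0, open `path`, and return the descriptor number obtained.

    `path` is a hypothetical stand-in for the cache file that the NSS
    library opens in the scenario from the thread.
    """
    try:
        saved = os.dup(0)  # keep a copy so stdin can be restored afterwards
    except OSError:
        saved = None  # stdin was already closed by the caller
    if saved is not None:
        os.close(0)  # mimic the launcher closing stdin before exec
    try:
        # POSIX open() returns the lowest-numbered descriptor not in use.
        fd = os.open(path, os.O_RDONLY)
        try:
            return fd
        finally:
            os.close(fd)
    finally:
        if saved is not None:
            os.dup2(saved, 0)  # put the original stdin back on fd 0
            os.close(saved)


if __name__ == "__main__":
    print("descriptor returned by open():", fd_reused_after_closing_stdin())
```

On a POSIX system the descriptor returned should be 0, which is why a
process that inherits a closed stdin can end up treating an unrelated file
as its standard input.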