[mvapich-discuss] mpirun_rsh -export quirks

Jonathan Perkins perkinjo at cse.ohio-state.edu
Fri Dec 13 18:02:18 EST 2013


Thank you for your report.

Yes, we don't overwrite environment variables that are already set.
Our documentation could be improved to explain the precedence of
environment variables and where they should be set.  Your admins could
try using the mvapich2 configuration files (see
http://mvapich.cse.ohio-state.edu/support/user_guide_mvapich2-1.9.html#x1-500006.3)
instead of setting the variables in /etc/profile.  This should avoid
the problem and allow users to override the system defaults more
easily.
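
To illustrate the precedence, here is a minimal shell sketch of
"set only if not already set" semantics.  It is not the actual
mpirun_rsh/mpispawn code; the set_if_unset helper and the
MY_APP_SETTING variable are hypothetical and exist only for this
example.  The first step stands in for /etc/profile running on the
compute node, the second for the launcher applying the user's exported
environment afterwards.

```shell
#!/bin/sh
# Hypothetical helper: assign VALUE to NAME only when NAME is not
# already set, mimicking a "do not overwrite" export.
set_if_unset() {
    name=$1
    value=$2
    # ${VAR+x} expands to "x" when VAR is set (even if empty).
    if eval "[ -z \"\${$name+x}\" ]"; then
        eval "$name=\$value; export $name"
    fi
}

# Start from a clean slate for the demonstration.
unset MV2_IBA_HCA MV2_NUM_HCAS MY_APP_SETTING

# Step 1: stands in for the system-wide defaults in /etc/profile.
export MV2_IBA_HCA=mlx4_0
export MV2_NUM_HCAS=1

# Step 2: stands in for the launcher applying the user's environment.
set_if_unset MV2_IBA_HCA mlx4_0:mlx4_1
set_if_unset MV2_NUM_HCAS 2
set_if_unset MY_APP_SETTING on   # hypothetical variable, not set in step 1

echo "MV2_IBA_HCA=$MV2_IBA_HCA"        # still mlx4_0
echo "MV2_NUM_HCAS=$MV2_NUM_HCAS"      # still 1
echo "MY_APP_SETTING=$MY_APP_SETTING"  # on
```

The two MV2_* values from step 1 survive, while the variable that the
startup file never touched comes through as requested.  This matches
the symptom in the report below: everything is exported except the
variables that /etc/profile had already set.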


On Fri, Dec 13, 2013 at 4:34 PM, Lockwood, Glenn <glock at sdsc.edu> wrote:
> Hi
>
> It looks like "mpirun_rsh -export" does not overwrite environment variables that are already set in shell startup files (see src/pm/mpirun/environ.c:43 in mvapich2 1.9).  I wanted to raise the point here in case anyone else is running into this peculiar behavior since this quirk doesn't appear to be mentioned in the mvapich2 1.9 manual.
>
> The issue arose because our system has a dual-rail configuration (two HCAs per host) that causes mvapich2 to hang unless we explicitly export MV2_IBA_HCA=mlx4_0 and MV2_NUM_HCAS=1.  We have this set in /etc/profile so that mvapich2 jobs are single-rail by default, but found that having users do something like
>
> export MV2_IBA_HCA=mlx4_0:mlx4_1
> export MV2_NUM_HCAS=2
> mpirun_rsh -export -np X -hostfile Y ./a.out
>
> would export everything EXCEPT the MV2_IBA_HCA and MV2_NUM_HCAS variables, causing single-rail behavior to persist.  This was the result of /etc/profile touching these variables before mpispawn got launched, preventing mpispawn from setting them to the correct values.
>
> The obvious workaround was to explicitly pass these two variables, e.g.,
>
> mpirun_rsh -export -np X -hostfile Y MV2_IBA_HCA=mlx4_0:mlx4_1 MV2_NUM_HCAS=2 ./a.out
>
> which does work.  It seems like this conditional exporting of the job's submit environment might be worth documenting in the mvapich2 user manual though, as I would imagine many sites have default MV2_* variables set system-wide like we do.
>
> Glenn
>
>
> --
> Glenn K. Lockwood, Ph.D.
> User Services Group
> San Diego Supercomputer Center
> glock at sdsc.edu / (858) 246-1075
>



-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo


