[mvapich-discuss] PBS Pro's qstat failed to show "Time Use" if building application with mvapich2

Jianyu Liu jerry_leo at msn.com
Sun Dec 22 22:06:14 EST 2013


Hi,

I set RSH_CMD=/opt/pbs/default/bin/pbsdsh and rebuilt mvapich2 with --enable-rsh, then submitted the job with mpirun_rsh like this:

mpirun_rsh -np $nprocs -hostfile $PBS_NODEFILE ./wrf.exe

stdout showed 'Timeout during client startup'
stderr showed '/usr/pbs/default/bin/pbsdsh: task 0x00000000 exit status 254'

It looks like rsh is disabled on the system and only ssh connections are allowed.
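
For reference, a minimal sketch of how an ssh-based launch line could be assembled inside a PBS job script. It assumes the node file has one non-empty line per MPI rank and that the installed mpirun_rsh accepts -ssh to select ssh at run time. The function name build_launch_cmd is hypothetical, and the function only prints the command rather than running it. Processes started over ssh run outside the PBS task manager, so this alone is not expected to change what qstat reports.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) an mpirun_rsh command line
# for the hosts listed in a PBS node file.
build_launch_cmd() {
    nodefile=$1   # path to the node file, e.g. "$PBS_NODEFILE"
    exe=$2        # MPI executable, e.g. ./wrf.exe

    # Assumption: one non-empty line per rank in the node file.
    nprocs=$(grep -c . "$nodefile")

    # -ssh asks mpirun_rsh to start remote processes over ssh.
    printf 'mpirun_rsh -ssh -np %s -hostfile %s %s\n' "$nprocs" "$nodefile" "$exe"
}

# Example use inside a job script:
#   build_launch_cmd "$PBS_NODEFILE" ./wrf.exe
```

Printing the command first makes it easy to check the rank count and host file path in the job's stdout before launching for real.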

Jerry

> Date: Sat, 21 Dec 2013 22:45:46 -0500
> Subject: Re: [mvapich-discuss] PBS Pro's qstat failed to show "Time Use" if building application with mvapich2
> From: perkinjo at cse.ohio-state.edu
> To: jerry_leo at msn.com
> CC: mvapich-discuss at cse.ohio-state.edu
> 
> Thanks for providing the info.  There aren't any additional options
> you can use to make this work except for possibly replacing the
> rsh/ssh command used by mpirun_rsh and trying that.
> 
> Is there something like pbsdsh under /opt/pbs/bin?  If so you can try
> rebuilding mvapich2 by setting RSH_CMD to that when configuring and
> using mpirun_rsh in your submit scripts.
> 
> On Sat, Dec 21, 2013 at 9:15 PM, Jianyu Liu <jerry_leo at msn.com> wrote:
> > Hi,
> >
> > Here is the output of 'mpiname -a':
> >
> > MVAPICH2 2.0b Fri Nov  8 11:17:40 EST 2013 ch3:mrail
> >
> > Compilation
> > CC: pgcc    -DNDEBUG -DNVALGRIND -O2
> > CXX: pgCC   -DNDEBUG -DNVALGRIND
> > F77: pgf77 -L/usr/lib64 -L/lib -L/lib
> > FC: pgf90
> >
> > Configuration
> > --prefix=/nuist/p/public/app/mvapich2/2.0b/pgi -with-ib-libpath=/usr/lib64
> > --with-ib-include=/usr/include --enable-f77 --enable-fc CC=pgcc CXX=pgCC
> > F77=pgf77 FC=pgf90
> >
> >
> > There is no rsh in /opt/pbs/bin.
> >
> > I submit my job with mpirun (in /nuist/p/public/app/mvapich2/2.0b/pgi/bin),
> > just like this
> >
> >     mpirun -np $nprocs ./wrf.exe
> >
> >
> > Jerry
> >
> >> Date: Sat, 21 Dec 2013 16:32:43 -0500
> >> Subject: Re: [mvapich-discuss] PBS Pro's qstat failed to show "Time Use"
> >> if building application with mvapich2
> >> From: perkinjo at cse.ohio-state.edu
> >> To: jerry_leo at msn.com
> >> CC: mvapich-discuss at cse.ohio-state.edu
> >
> >>
> >> I'd like to know how MVAPICH2 was built and how you are running your
> >> jobs. You can send the output of `mpiname -a` and/or config.log from
> >> your build directory if you built MVAPICH2 yourself. Can you also
> >> provide information on how you submit your jobs, such as whether you're
> >> using mpirun_rsh or mpiexec?
> >>
> >> I believe that using RSH_CMD=/path/to/pbs/rsh and --enable-rsh at
> >> build time will give the desired results if you're using mpirun_rsh
> >> with PBS Pro. You should also get the desired results by using hydra
> >> (mpiexec).
> >>
> >>
> >> On Sat, Dec 21, 2013 at 2:13 AM, Jianyu Liu <jerry_leo at msn.com> wrote:
> >> > Hi,
> >> >
> >> >
> >> > Here is the environment info:
> >> >
> >> > Job scheduler : PBSPro 11.3.0.
> >> > OS: Red Hat Enterprise Linux Server release 6.2
> >> > mvapich2 : 2.0b
> >> > InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s -
> >> > IB QDR / 10GigE] (rev b0)
> >> > mlx4_core: 1.0-mlnx_ofed1.5.3
> >> > mlx4_en: 1.5.8.3 (June 2012)
> >> >
> >> >
> >> > If the application was built with mvapich2, qstat fails to show
> >> > "Time Use" when checking the job status, like this:
> >> >
> >> >
> >> > [jliu at log08~]# qstat
> >> > Job id            Name             User              Time Use S Queue
> >> > ----------------  ---------------- ----------------  -------- - -----
> >> > 94540.log05       WRF3             jliu              00:00:00 R Regular
> >> >
> >> >
> >> > The job works fine; only "Time Use" always shows as zero.
> >> >
> >> >
> >> > But if the application was built with OpenMPI, there is no such
> >> > issue.
> >> >
> >> >
> >> > My question is:
> >> >
> >> > How can mvapich2 be built to support PBS Pro, or are there any run-time
> >> > options that make qstat show "Time Use" properly?
> >> >
> >> >
> >> > Thanks
> >> >
> >> >
> >> > Jerry
> >> >
> >> > _______________________________________________
> >> > mvapich-discuss mailing list
> >> > mvapich-discuss at cse.ohio-state.edu
> >> > http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Jonathan Perkins
> >> http://www.cse.ohio-state.edu/~perkinjo
> >>
> 
> 
> 
> -- 
> Jonathan Perkins
> http://www.cse.ohio-state.edu/~perkinjo
> 
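
To compare launch methods, the CPU time that PBS has accounted for a running job can be read from the resources_used.cput attribute in `qstat -f` output, which is believed to be what the "Time Use" column summarizes. A minimal sketch, assuming the usual "name = value" layout of `qstat -f`; the function name job_cput is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read `qstat -f` output on stdin and print the
# value of resources_used.cput, if that attribute is present.
job_cput() {
    # Strip everything up to and including "resources_used.cput = ".
    sed -n 's/^[[:space:]]*resources_used\.cput[[:space:]]*=[[:space:]]*//p'
}

# Example use on a login node:
#   qstat -f 94540.log05 | job_cput
```

If the value stays at zero while the job is clearly computing, the MPI processes are probably running outside PBS's accounting, which is consistent with the symptom described in this thread.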
 		 	   		  