[mvapich-discuss] Question on PBS

Jonathan Perkins perkinjo at cse.ohio-state.edu
Thu Mar 5 09:58:08 EST 2009


Ehsan:
Are you able to manually ssh into each node outside of PBS?  It looks
like you may have a problem with node17.  See if you can ssh into and
out of this node successfully.
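
One way to run that check on every node at once is sketched below. Treat it as a rough sketch only: check_ssh_nodes and SSH_CMD are names made up for illustration, and it assumes that mpdboot relies on passwordless (key-based) ssh at your site. For each unique host in a node list, it logs in non-interactively and asks the remote shell whether it can find ssh itself, since your log shows "ssh: command not found".

```shell
#!/bin/bash
# Sketch: check that every node accepts a non-interactive ssh login and
# that ssh is on the PATH of the remote shell.
# SSH_CMD and check_ssh_nodes are hypothetical names, not part of MVAPICH2 or PBS.

SSH_CMD=${SSH_CMD:-ssh}

check_ssh_nodes() {
    # $1: file with one hostname per line (duplicates allowed, as in a PBS nodefile)
    local nodefile=$1 node failed=0
    for node in $(sort -u "$nodefile"); do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        # The remote "command -v ssh" fails if the node cannot find ssh.
        if $SSH_CMD -o BatchMode=yes -o ConnectTimeout=5 "$node" 'command -v ssh' \
               </dev/null >/dev/null 2>&1; then
            echo "OK   $node"
        else
            echo "FAIL $node"
            failed=1
        fi
    done
    return $failed
}

# Example invocation inside a PBS job, where PBS provides the node list:
#   check_ssh_nodes "$PBS_NODEFILE"
```

Running it once from the login node and once from node17 itself would cover both directions. A FAIL line only says that the login or the remote lookup of ssh did not succeed; it does not say which of the two went wrong.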

On Thu, Mar 05, 2009 at 02:09:30PM +0000, Ehsan Roohi wrote:
> Dear All,
> 
> I got the following error while submitting my job to the HPC cluster:
> 
> The .o file contains:
> 
> running mpdallexit on node17
> LAUNCHED mpd on node17  via
> RUNNING: mpd on node17
> LAUNCHED mpd on node16  via  node17
> LAUNCHED mpd on node15  via  node17
> 
> mpdboot_node17 (handle_mpd_output 373): from mpd on node16, invalid port info:
> /bin/sh: line 1: ssh: command not found
> 
> mpdtrace: cannot connect to local mpd (/tmp/mpd2.console_seb09103); possible causes:
>   1. no mpd is running on this host
>   2. an mpd is running but was started without a "console" (-n option)
> mpiexec_node17: cannot connect to local mpd (/tmp/mpd2.console_seb09103); possible causes:
>   1. no mpd is running on this host
>   2. an mpd is running but was started without a "console" (-n option)
> mpdallexit: cannot connect to local mpd (/tmp/mpd2.console_seb09103); possible causes:
>   1. no mpd is running on this host
>   2. an mpd is running but was started without a "console" (-n option)
> 
> The PBS script that I use is:
> 
> #!/bin/bash
> 
> #PBS -l walltime=00:15:00
> #PBS -l nodes=4:ppn=2
> #PBS -V
> #PBS -N testjob
> # set echo               # echo commands before execution; use for debugging
> cd $SCR
> 
> # get executable and input files from mass storage
> #msscmd cd dir1, get a.out, mget *.input
> # mss doesn't keep executable bit set, so need to set it on program
> #chmod +x a.out
> 
> #mvapich2-start-mpd
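> # NP: total number of MPI processes (one line per slot in the nodefile)
> export NP=$(wc -l < ${PBS_NODEFILE})
> # MPDSNP: number of distinct nodes, i.e. how many mpd daemons to start
> export MPDSNP=$(uniq ${PBS_NODEFILE} | wc -l)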
> cat ${PBS_NODEFILE} | uniq > /tmp/mpd_nodefile_${USER}_$$
> export MPD_NODEFILE=/tmp/mpd_nodefile_${USER}_$$
> mpdboot -v -n ${MPDSNP} -f ${MPD_NODEFILE}
> mpdtrace -l
> rm ${MPD_NODEFILE}
> rm -f /tmp/mypbsnodes${USER}_$$
> export NP=$(wc -l < ${PBS_NODEFILE})
> export MV2_SRQ_SIZE=4000
> mpirun -np ${NP} -machinefile ${PBS_NODEFILE} a.out
> mpdallexit
> ---------------------------------------------------------------------------------------------------------------------
> 
> 
> Would you please help me with this problem?
> 
> Thanks,
> Ehsan

-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo