[mvapich-discuss] Question on PBS

Ehsan Roohi ehsan.roohi at strath.ac.uk
Thu Mar 5 09:09:30 EST 2009


Dear All,

I got the following error while trying to submit my job to the HPC cluster:

The .o file contains:

running mpdallexit on node17
LAUNCHED mpd on node17  via
RUNNING: mpd on node17
LAUNCHED mpd on node16  via  node17
LAUNCHED mpd on node15  via  node17

mpdboot_node17 (handle_mpd_output 373): from mpd on node16, invalid port info:
/bin/sh: line 1: ssh: command not found

mpdtrace: cannot connect to local mpd (/tmp/mpd2.console_seb09103); possible causes:
  1. no mpd is running on this host
  2. an mpd is running but was started without a "console" (-n option)
mpiexec_node17: cannot connect to local mpd (/tmp/mpd2.console_seb09103); possible causes:
  1. no mpd is running on this host
  2. an mpd is running but was started without a "console" (-n option)
mpdallexit: cannot connect to local mpd (/tmp/mpd2.console_seb09103); possible causes:
  1. no mpd is running on this host
  2. an mpd is running but was started without a "console" (-n option)

The PBS script that I use is:

#!/bin/bash

#PBS -l walltime=00:15:00
#PBS -l nodes=4:ppn=2
#PBS -V
#PBS -N testjob
# set echo               # echo commands before execution; use for debugging
cd $SCR

# get executable and input files from mass storage
#msscmd cd dir1, get a.out, mget *.input
# mss doesn't keep executable bit set, so need to set it on program
#chmod +x a.out

#mvapich2-start-mpd
export NP=`wc -l ${PBS_NODEFILE} | cut -d'/' -f1`
export MPDSNP=`uniq ${PBS_NODEFILE} |wc -l| cut -d'/' -f1`
cat ${PBS_NODEFILE} | uniq > /tmp/mpd_nodefile_${USER}_$$
export MPD_NODEFILE=/tmp/mpd_nodefile_${USER}_$$
mpdboot -v -n ${MPDSNP} -f ${MPD_NODEFILE}
mpdtrace -l
rm ${MPD_NODEFILE}
rm -f /tmp/mypbsnodes${USER}_$$
export NP=`wc -l ${PBS_NODEFILE} | cut -d'/' -f1`
export MV2_SRQ_SIZE=4000
mpirun  -machinefile ${PBS_NODEFILE}  a.out
mpdallexit
---------------------------------------------------------------------------------------------------------------------
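
The line "/bin/sh: line 1: ssh: command not found" in the output above suggests that some shell involved in starting the mpd ring could not find ssh on its PATH. A minimal sketch of pre-flight checks that could run in the job script before mpdboot, assuming bash; the function names count_slots, count_hosts and check_launcher are hypothetical helpers, not part of MVAPICH2 or PBS:

```shell
#!/bin/bash
# Hypothetical helpers for illustration only; not part of MVAPICH2 or PBS.

# Total number of slots: one line per process slot in the node file.
count_slots() {
    wc -l < "$1" | tr -d '[:space:]'
}

# Number of distinct hosts in the node file (this is what mpdboot -n expects).
count_hosts() {
    sort -u "$1" | wc -l | tr -d '[:space:]'
}

# Print "found" if the given command (ssh by default) is on PATH,
# otherwise print "missing". This only checks the node it runs on.
check_launcher() {
    if command -v "${1:-ssh}" >/dev/null 2>&1; then
        echo "found"
    else
        echo "missing"
    fi
}
```

In the script above this might be used as NP=$(count_slots "$PBS_NODEFILE") and MPDSNP=$(count_hosts "$PBS_NODEFILE"), followed by check_launcher ssh before calling mpdboot. Since check_launcher only tests the local PATH, the same check would have to be run on each node in the node file to find the one where ssh is missing.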


Could you please help me with this problem?

Thanks,
Ehsan


