[mvapich-discuss] Problem running Mvapich2 1.8 on Centos 6.3

Jonathan Perkins perkinjo at cse.ohio-state.edu
Tue Jul 17 18:27:26 EDT 2012


Thanks for the update.  Glad that you were able to figure it out.
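
For anyone who hits the same "Fatal error in MPI_Init: Other MPI error", the
cause Craig reports in the quoted message below (missing rw permissions on
some /dev/infiniband devices) can be checked with a small shell function. This
is only a sketch: the function name check_ib_perms is made up for
illustration, and it tests access for whichever user runs it, so it should be
run as the user who launches the MPI job.

```shell
#!/bin/sh
# Report entries under a device directory that the current user cannot
# both read and write. The directory defaults to /dev/infiniband.
# check_ib_perms is a hypothetical helper name, not part of MVAPICH2.
check_ib_perms() {
    dir="${1:-/dev/infiniband}"
    if [ ! -d "$dir" ]; then
        echo "missing directory: $dir"
        return 2
    fi
    bad=0
    for dev in "$dir"/*; do
        [ -e "$dev" ] || continue   # skip the unexpanded glob in an empty directory
        if [ -r "$dev" ] && [ -w "$dev" ]; then
            echo "ok      $dev"
        else
            echo "NO-RW   $dev"
            bad=1
        fi
    done
    return "$bad"
}
```

Calling check_ib_perms with no argument on each compute node lists the devices
and returns non-zero if any of them is not readable and writable. The thread
does not say which devices were affected or what mode they were changed to, so
the function only reports and leaves the fix to the site's own device setup.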

On Tue, Jul 17, 2012 at 04:24:42PM -0600, Craig Tierney wrote:
> Jonathan,
> 
> After fighting with the build process for a day, I found the problem: my
> diskless nodes with Centos-6.3 did not have the proper rw permissions
> on some of the /dev/infiniband devices.
> 
> Changing this fixed the problem.
> 
> Thanks for the help.
> 
> Craig
> 
> On 7/16/12 8:35 PM, Jonathan Perkins wrote:
> > Hi Craig,
> > 
> > Can you try adding --enable-g=dbg and --disable-fast to your configure
> > options?  This should give us more information to help debug the issue
> > that you're seeing.
> > 
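
A sketch of the reconfigure requested above: the configure line from Craig's
original post (quoted further down) with the two debugging options appended.
The DEBUG_OPTS variable and the check for ./configure are additions made for
this sketch; everything else is copied from the thread as posted.

```shell
#!/bin/sh
# Debugging options suggested in this thread.
DEBUG_OPTS="--enable-g=dbg --disable-fast"

# Run from the top of the MVAPICH2 1.8 source tree. The guard is an addition
# for this sketch; it makes the snippet do nothing anywhere else.
if [ -x ./configure ]; then
    # $DEBUG_OPTS is left unquoted on purpose so it splits into two options.
    ./configure CC=icc CXX=icpc F77=ifort FC=ifort \
        LDFLAGS="-Wl,-rpath $INTEL/compiler/lib/intel64" \
        --prefix=/apps/mvapich2/1.8-intel \
        --with-rdma=gen2 \
        --with-ib-libpath=/usr/lib64 \
        --enable-romio=yes --with-file-system=lustre+panfs --enable-shared \
        $DEBUG_OPTS
fi
```

After rebuilding and reinstalling, rerunning the failing job is the next step.
The only expectation stated in the thread is that the error output should
become more informative.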
> > On Mon, Jul 16, 2012 at 05:16:20PM -0600, Craig Tierney wrote:
> >> Hello,
> >>
> >> We are building up a new image and upgrading Mvapich2 to the latest, 1.8.
> >> When our image was Centos 6.2, I had no problems building and running basic
> >> MPI jobs.  I built mvapich2 with:
> >>
> >>  ./configure CC=icc CXX=icpc F77=ifort FC=ifort \
> >>        LDFLAGS="-Wl,-rpath $INTEL/compiler/lib/intel64" \
> >>        --prefix=/apps/mvapich2/1.8-intel \
> >>        --with-rdma=gen2 \
> >>        --with-ib-libpath=/usr/lib64 \
> >>        --enable-romio=yes --with-file-system=lustre+panfs --enable-shared
> >>
> >> But since Centos 6.3 just came out, we updated the image with all available
> >> packages and tried again.  I was no longer able to launch jobs.  Rebuilding
> >> mvapich2 did not help.
> >>
> >> When I try to run a code, I get the following (full debug output below):
> >>
> >> [cli_0]: aborting job:
> >> Fatal error in MPI_Init:
> >> Other MPI error
> >>
> >> =====================================================================================
> >> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> >> =   EXIT CODE: 256
> >> =   CLEANING UP REMAINING PROCESSES
> >> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> >> =====================================================================================
> >> [cli_1]: aborting job:
> >> Fatal error in MPI_Init:
> >> Other MPI error
> >>
> >> =====================================================================================
> >> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> >> =   EXIT CODE: 256
> >> =   CLEANING UP REMAINING PROCESSES
> >> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> >> =====================================================================================
> >>
> >> Am I the first person to try this?
> >>
> >> Thanks,
> >> Craig
> >>
> >>
> >> ------------------
> >> Full HYDRA_DEBUG=1 output:
> >>
> >>
> >> host: h234
> >> host: h235
> >>
> >> ==================================================================================================
> >> mpiexec options:
> >> ----------------
> >>   Base path: /apps/mvapich2/1.8-intel/bin/
> >>   Launcher: (null)
> >>   Debug level: 1
> >>   Enable X: -1
> >>
> >>   Global environment:
> >>   -------------------
> >>     USER=ctierney
> >>     LOGNAME=ctierney
> >>     HOME=/home/ctierney
> >>     PATH=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/bin/intel64:/home/ctierney/gmt/bin:/home/ctierney/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/esrl/bin:/opt/sge/default/bin/lx26-amd64:/apps/moab/default/bin:/apps/moab/default/sbin:/home/ctierney/GPU/trunk/GPU/bin:/apps/mvapich2/1.8-intel/bin
> >>     MAIL=/var/spool/mail/ctierney
> >>     SHELL=/bin/tcsh
> >>     SSH_CLIENT=10.178.22.53 58254 22
> >>     SSH_CONNECTION=10.178.22.53 58254 10.178.22.54 22
> >>     SSH_TTY=/dev/pts/0
> >>     TERM=xterm
> >>     HOSTTYPE=x86_64-linux
> >>     VENDOR=unknown
> >>     OSTYPE=linux
> >>     MACHTYPE=x86_64
> >>     SHLVL=1
> >>     PWD=/home/ctierney/OMB-3.1.1
> >>     GROUP=jetmgmt
> >>     HOST=h234
> >>     REMOTEHOST=h233
> >>     HOSTNAME=h234
> >>     LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.tbz=01;31:*.tbz2=01;31:*.bz=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:
> >>     G_BROKEN_FILENAMES=1
> >>     LESSOPEN=|/usr/bin/lesspipe.sh %s
> >>     MODULE_VERSION=3.2.9
> >>     MODULE_VERSION_STACK=3.2.9
> >>     MODULESHOME=/apps/Modules/3.2.9
> >>     MODULEPATH=/apps/Modules/versions:/apps/Modules/$MODULE_VERSION/modulefiles:/apps/Modules/modulefiles:/apps/Modules/default/systemdefaults:/apps/Modules/default/compilers:/apps/Modules/default/systemtools:/apps/Modules/default/datatools:/apps/Modules/default/modulefamilies/intel:/apps/Modules/default/modulefamilies/intel-mvapich2
> >>     LOADEDMODULES=sge:intel/12.1.4:mvapich2/1.8
> >>     QTDIR=/usr/lib64/qt-3.3
> >>     QTINC=/usr/lib64/qt-3.3/include
> >>     QTLIB=/usr/lib64/qt-3.3/lib
> >>     LD_LIBRARY_PATH=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/mkl/lib/intel64:/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/compiler/lib/intel64:/opt/sge/default/lib/lx26-amd64
> >>     MANPATH=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/man/en_US:/usr/share/man:/opt/sge/default/man:/apps/moab/default/man
> >>     MOAB=/apps/moab/default
> >>     MOABHOMEDIR=/apps/moab/moabhome
> >>     SGE_ROOT=/opt/sge/default
> >>     _LMFILES_=/apps/Modules/default/systemdefaults/sge:/apps/Modules/default/compilers/intel/12.1.4:/apps/Modules/default/modulefamilies/intel/mvapich2/1.8
> >>     GPUF2C=/home/ctierney/GPU/trunk/GPU
> >>     GMT=/home/ctierney/gmt
> >>     X509_USER_PROXY=/home/ctierney/.x509_user_proxy
> >>     PERL5LIB=/home/ctierney/chron/lib:/home/ctierney/chron/lib/perl5/site_perl
> >>     HPSS_PRINCIPAL=craig.tierney
> >>     HPSS_KEYTAB_PATH=/home/ctierney/hpss/craig.tierney.keytab
> >>     HPSS_SERVER_HOST=hpsscore1.fairmont.rdhpcs.noaa.gov
> >>     HPSS_PFTPC_PORT_RANGE=ncacn_ip_tcp[38000-38999]
> >>     HPSS_HOSTNAME=h234.boulder.rdhpcs.noaa.gov
> >>     HPSS_CFG_FILE_PATH=/home/ctierney
> >>     CPATH=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/mkl/include
> >>     FPATH=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/mkl/includelp64
> >>     INCLUDE=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/mkl/include
> >>     INTEL=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319
> >>     INTEL_LICENSE_FILE=/apps/intel/licenses
> >>     LIBRARY_PATH=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/mkl/lib/intel64:/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/compiler/lib/intel64
> >>     MKLROOT=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/mkl
> >>     MKL_LP64_ILP64=lp64
> >>     MKL_TARGET_ARCH=intel64
> >>     MPICH=/apps/mvapich2/1.8-intel
> >>     MVAPICH=/apps/mvapich2/1.8-intel
> >>     HYDRA_DEBUG=1
> >>
> >>   Hydra internal environment:
> >>   ---------------------------
> >>     GFORTRAN_UNBUFFERED_PRECONNECTED=y
> >>
> >>
> >>     Proxy information:
> >>     *********************
> >>       [1] proxy: h234 (1 cores)
> >>       Exec list: ./osu_bw (1 processes);
> >>
> >>       [2] proxy: h235 (1 cores)
> >>       Exec list: ./osu_bw (1 processes);
> >>
> >>
> >> ==================================================================================================
> >>
> >> [mpiexec at h234] Timeout set to -1 (-1 means infinite)
> >> [mpiexec at h234] Got a control port string of h234:51408
> >>
> >> Proxy launch args: /apps/mvapich2/1.8-intel/bin/hydra_pmi_proxy --control-port h234:51408 --debug --rmk user --launcher ssh --demux poll --pgid 0 --retries 10
> >> --proxy-id
> >>
> >> [mpiexec at h234] PMI FD: (null); PMI PORT: (null); PMI ID/RANK: -1
> >> Arguments being passed to proxy 0:
> >> --version 1.4.1p1 --iface-ip-env-name MPICH_INTERFACE_HOSTNAME --hostname h234 --global-core-map 0,1,1 --filler-process-map 0,1,1 --global-process-count 2
> >> --auto-cleanup 1 --pmi-rank -1 --pmi-kvsname kvs_3934_0 --pmi-process-mapping (vector,(0,2,1)) --ckpoint-num -1 --global-inherited-env 59 'USER=ctierney'
> >> 'LOGNAME=ctierney' 'HOME=/home/ctierney'
> >> 'PATH=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/bin/intel64:/home/ctierney/gmt/bin:/home/ctierney/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/esrl/bin:/opt/sge/default/bin/lx26-amd64:/apps/moab/default/bin:/apps/moab/default/sbin:/home/ctierney/GPU/trunk/GPU/bin:/apps/mvapich2/1.8-intel/bin'
> >> 'MAIL=/var/spool/mail/ctierney' 'SHELL=/bin/tcsh' 'SSH_CLIENT=10.178.22.53 58254 22' 'SSH_CONNECTION=10.178.22.53 58254 10.178.22.54 22' 'SSH_TTY=/dev/pts/0'
> >> 'TERM=xterm' 'HOSTTYPE=x86_64-linux' 'VENDOR=unknown' 'OSTYPE=linux' 'MACHTYPE=x86_64' 'SHLVL=1' 'PWD=/home/ctierney/OMB-3.1.1' 'GROUP=jetmgmt' 'HOST=h234'
> >> 'REMOTEHOST=h233' 'HOSTNAME=h234'
> >> 'LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.tbz=01;31:*.tbz2=01;31:*.bz=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:'
> >> 'G_BROKEN_FILENAMES=1' 'LESSOPEN=|/usr/bin/lesspipe.sh %s' 'MODULE_VERSION=3.2.9' 'MODULE_VERSION_STACK=3.2.9' 'MODULESHOME=/apps/Modules/3.2.9'
> >> 'MODULEPATH=/apps/Modules/versions:/apps/Modules/$MODULE_VERSION/modulefiles:/apps/Modules/modulefiles:/apps/Modules/default/systemdefaults:/apps/Modules/default/compilers:/apps/Modules/default/systemtools:/apps/Modules/default/datatools:/apps/Modules/default/modulefamilies/intel:/apps/Modules/default/modulefamilies/intel-mvapich2'
> >> 'LOADEDMODULES=sge:intel/12.1.4:mvapich2/1.8' 'QTDIR=/usr/lib64/qt-3.3' 'QTINC=/usr/lib64/qt-3.3/include' 'QTLIB=/usr/lib64/qt-3.3/lib'
> >> 'LD_LIBRARY_PATH=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/mkl/lib/intel64:/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/compiler/lib/intel64:/opt/sge/default/lib/lx26-amd64'
> >> 'MANPATH=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/man/en_US:/usr/share/man:/opt/sge/default/man:/apps/moab/default/man' 'MOAB=/apps/moab/default'
> >> 'MOABHOMEDIR=/apps/moab/moabhome' 'SGE_ROOT=/opt/sge/default'
> >> '_LMFILES_=/apps/Modules/default/systemdefaults/sge:/apps/Modules/default/compilers/intel/12.1.4:/apps/Modules/default/modulefamilies/intel/mvapich2/1.8'
> >> 'GPUF2C=/home/ctierney/GPU/trunk/GPU' 'GMT=/home/ctierney/gmt' 'X509_USER_PROXY=/home/ctierney/.x509_user_proxy'
> >> 'PERL5LIB=/home/ctierney/chron/lib:/home/ctierney/chron/lib/perl5/site_perl' 'HPSS_PRINCIPAL=craig.tierney'
> >> 'HPSS_KEYTAB_PATH=/home/ctierney/hpss/craig.tierney.keytab' 'HPSS_SERVER_HOST=hpsscore1.fairmont.rdhpcs.noaa.gov'
> >> 'HPSS_PFTPC_PORT_RANGE=ncacn_ip_tcp[38000-38999]' 'HPSS_HOSTNAME=h234.boulder.rdhpcs.noaa.gov' 'HPSS_CFG_FILE_PATH=/home/ctierney'
> >> 'CPATH=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/mkl/include' 'FPATH=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/mkl/includelp64'
> >> 'INCLUDE=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/mkl/include' 'INTEL=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319'
> >> 'INTEL_LICENSE_FILE=/apps/intel/licenses'
> >> 'LIBRARY_PATH=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/mkl/lib/intel64:/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/compiler/lib/intel64'
> >> 'MKLROOT=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/mkl' 'MKL_LP64_ILP64=lp64' 'MKL_TARGET_ARCH=intel64' 'MPICH=/apps/mvapich2/1.8-intel'
> >> 'MVAPICH=/apps/mvapich2/1.8-intel' 'HYDRA_DEBUG=1' --global-user-env 0 --global-system-env 1 'GFORTRAN_UNBUFFERED_PRECONNECTED=y' --proxy-core-count 1 --exec
> >> --exec-appnum 0 --exec-proc-count 1 --exec-local-env 0 --exec-wdir /home/ctierney/OMB-3.1.1 --exec-args 1 ./osu_bw
> >>
> >> [mpiexec at h234] PMI FD: (null); PMI PORT: (null); PMI ID/RANK: -1
> >> Arguments being passed to proxy 1:
> >> --version 1.4.1p1 --iface-ip-env-name MPICH_INTERFACE_HOSTNAME --hostname h235 --global-core-map 1,1,0 --filler-process-map 1,1,0 --global-process-count 2
> >> --auto-cleanup 1 --pmi-rank -1 --pmi-kvsname kvs_3934_0 --pmi-process-mapping (vector,(0,2,1)) --ckpoint-num -1 --global-inherited-env 59 'USER=ctierney'
> >> 'LOGNAME=ctierney' 'HOME=/home/ctierney'
> >> 'PATH=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/bin/intel64:/home/ctierney/gmt/bin:/home/ctierney/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/esrl/bin:/opt/sge/default/bin/lx26-amd64:/apps/moab/default/bin:/apps/moab/default/sbin:/home/ctierney/GPU/trunk/GPU/bin:/apps/mvapich2/1.8-intel/bin'
> >> 'MAIL=/var/spool/mail/ctierney' 'SHELL=/bin/tcsh' 'SSH_CLIENT=10.178.22.53 58254 22' 'SSH_CONNECTION=10.178.22.53 58254 10.178.22.54 22' 'SSH_TTY=/dev/pts/0'
> >> 'TERM=xterm' 'HOSTTYPE=x86_64-linux' 'VENDOR=unknown' 'OSTYPE=linux' 'MACHTYPE=x86_64' 'SHLVL=1' 'PWD=/home/ctierney/OMB-3.1.1' 'GROUP=jetmgmt' 'HOST=h234'
> >> 'REMOTEHOST=h233' 'HOSTNAME=h234'
> >> 'LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.tbz=01;31:*.tbz2=01;31:*.bz=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:'
> >> 'G_BROKEN_FILENAMES=1' 'LESSOPEN=|/usr/bin/lesspipe.sh %s' 'MODULE_VERSION=3.2.9' 'MODULE_VERSION_STACK=3.2.9' 'MODULESHOME=/apps/Modules/3.2.9'
> >> 'MODULEPATH=/apps/Modules/versions:/apps/Modules/$MODULE_VERSION/modulefiles:/apps/Modules/modulefiles:/apps/Modules/default/systemdefaults:/apps/Modules/default/compilers:/apps/Modules/default/systemtools:/apps/Modules/default/datatools:/apps/Modules/default/modulefamilies/intel:/apps/Modules/default/modulefamilies/intel-mvapich2'
> >> 'LOADEDMODULES=sge:intel/12.1.4:mvapich2/1.8' 'QTDIR=/usr/lib64/qt-3.3' 'QTINC=/usr/lib64/qt-3.3/include' 'QTLIB=/usr/lib64/qt-3.3/lib'
> >> 'LD_LIBRARY_PATH=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/mkl/lib/intel64:/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/compiler/lib/intel64:/opt/sge/default/lib/lx26-amd64'
> >> 'MANPATH=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/man/en_US:/usr/share/man:/opt/sge/default/man:/apps/moab/default/man' 'MOAB=/apps/moab/default'
> >> 'MOABHOMEDIR=/apps/moab/moabhome' 'SGE_ROOT=/opt/sge/default'
> >> '_LMFILES_=/apps/Modules/default/systemdefaults/sge:/apps/Modules/default/compilers/intel/12.1.4:/apps/Modules/default/modulefamilies/intel/mvapich2/1.8'
> >> 'GPUF2C=/home/ctierney/GPU/trunk/GPU' 'GMT=/home/ctierney/gmt' 'X509_USER_PROXY=/home/ctierney/.x509_user_proxy'
> >> 'PERL5LIB=/home/ctierney/chron/lib:/home/ctierney/chron/lib/perl5/site_perl' 'HPSS_PRINCIPAL=craig.tierney'
> >> 'HPSS_KEYTAB_PATH=/home/ctierney/hpss/craig.tierney.keytab' 'HPSS_SERVER_HOST=hpsscore1.fairmont.rdhpcs.noaa.gov'
> >> 'HPSS_PFTPC_PORT_RANGE=ncacn_ip_tcp[38000-38999]' 'HPSS_HOSTNAME=h234.boulder.rdhpcs.noaa.gov' 'HPSS_CFG_FILE_PATH=/home/ctierney'
> >> 'CPATH=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/mkl/include' 'FPATH=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/mkl/includelp64'
> >> 'INCLUDE=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/mkl/include' 'INTEL=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319'
> >> 'INTEL_LICENSE_FILE=/apps/intel/licenses'
> >> 'LIBRARY_PATH=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/mkl/lib/intel64:/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/compiler/lib/intel64'
> >> 'MKLROOT=/apps/intel/12.1.4/composer_xe_2011_sp1.10.319/mkl' 'MKL_LP64_ILP64=lp64' 'MKL_TARGET_ARCH=intel64' 'MPICH=/apps/mvapich2/1.8-intel'
> >> 'MVAPICH=/apps/mvapich2/1.8-intel' 'HYDRA_DEBUG=1' --global-user-env 0 --global-system-env 1 'GFORTRAN_UNBUFFERED_PRECONNECTED=y' --proxy-core-count 1 --exec
> >> --exec-appnum 0 --exec-proc-count 1 --exec-local-env 0 --exec-wdir /home/ctierney/OMB-3.1.1 --exec-args 1 ./osu_bw
> >>
> >> [mpiexec at h234] Launch arguments: /apps/mvapich2/1.8-intel/bin/hydra_pmi_proxy --control-port h234:51408 --debug --rmk user --launcher ssh --demux poll --pgid 0
> >> --retries 10 --proxy-id 0
> >> [mpiexec at h234] Launch arguments: /usr/bin/ssh -x h235 "/apps/mvapich2/1.8-intel/bin/hydra_pmi_proxy" --control-port h234:51408 --debug --rmk user --launcher ssh
> >> --demux poll --pgid 0 --retries 10 --proxy-id 1
> >> [proxy:0:0 at h234] got pmi command (from 0): init
> >> pmi_version=1 pmi_subversion=1
> >> [proxy:0:0 at h234] PMI response: cmd=response_to_init pmi_version=1 pmi_subversion=1 rc=0
> >> [proxy:0:0 at h234] got pmi command (from 0): get_maxes
> >>
> >> [proxy:0:0 at h234] PMI response: cmd=maxes kvsname_max=256 keylen_max=64 vallen_max=1024
> >> [proxy:0:0 at h234] got pmi command (from 0): get_appnum
> >>
> >> [proxy:0:0 at h234] PMI response: cmd=appnum appnum=0
> >> [proxy:0:0 at h234] got pmi command (from 0): get_my_kvsname
> >>
> >> [proxy:0:0 at h234] PMI response: cmd=my_kvsname kvsname=kvs_3934_0
> >> [proxy:0:0 at h234] got pmi command (from 0): get_my_kvsname
> >>
> >> [proxy:0:0 at h234] PMI response: cmd=my_kvsname kvsname=kvs_3934_0
> >> [proxy:0:0 at h234] got pmi command (from 0): get
> >> kvsname=kvs_3934_0 key=PMI_process_mapping
> >> [proxy:0:0 at h234] PMI response: cmd=get_result rc=0 msg=success value=(vector,(0,2,1))
> >> [cli_0]: aborting job:
> >> Fatal error in MPI_Init:
> >> Other MPI error
> >>
> >>
> >> =====================================================================================
> >> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> >> =   EXIT CODE: 256
> >> =   CLEANING UP REMAINING PROCESSES
> >> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> >> =====================================================================================
> >> [proxy:0:1 at h235] got pmi command (from 4): init
> >> pmi_version=1 pmi_subversion=1
> >> [proxy:0:1 at h235] PMI response: cmd=response_to_init pmi_version=1 pmi_subversion=1 rc=0
> >> [proxy:0:1 at h235] got pmi command (from 4): get_maxes
> >>
> >> [proxy:0:1 at h235] PMI response: cmd=maxes kvsname_max=256 keylen_max=64 vallen_max=1024
> >> [proxy:0:1 at h235] got pmi command (from 4): get_appnum
> >>
> >> [proxy:0:1 at h235] PMI response: cmd=appnum appnum=0
> >> [proxy:0:1 at h235] got pmi command (from 4): get_my_kvsname
> >>
> >> [proxy:0:1 at h235] PMI response: cmd=my_kvsname kvsname=kvs_3934_0
> >> [proxy:0:1 at h235] got pmi command (from 4): get_my_kvsname
> >>
> >> [proxy:0:1 at h235] PMI response: cmd=my_kvsname kvsname=kvs_3934_0
> >> [proxy:0:1 at h235] got pmi command (from 4): get
> >> kvsname=kvs_3934_0 key=PMI_process_mapping
> >> [proxy:0:1 at h235] PMI response: cmd=get_result rc=0 msg=success value=(vector,(0,2,1))
> >> [cli_1]: aborting job:
> >> Fatal error in MPI_Init:
> >> Other MPI error
> >>
> >>
> >> =====================================================================================
> >> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> >> =   EXIT CODE: 256
> >> =   CLEANING UP REMAINING PROCESSES
> >> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> >> =====================================================================================
> >>
> > 
> 
> 

-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo


