[mvapich-discuss] osu benchmark execution error

Jonathan Perkins perkinjo at cse.ohio-state.edu
Fri Sep 23 10:23:22 EDT 2011


After some troubleshooting offline, we determined that Hoot's issue
was caused by the limit on max locked memory on his system.  It was
resolved by setting that limit to unlimited in
/etc/security/limits.conf.
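
For anyone hitting the same MPI_Init failure, here is a rough sketch of
how the limit can be checked.  The exact limits.conf lines used on
Hoot's machines are not recorded in this thread, so the entries shown
in the comments are only the usual form (an assumption), and the helper
names memlock_ok and check_memlock are made up for illustration.

```shell
# Typical /etc/security/limits.conf entries (assumed, not copied from
# Hoot's system); "*" applies them to all users:
#
#   * soft memlock unlimited
#   * hard memlock unlimited
#
# The new limit only takes effect for sessions started after the edit.

# Hypothetical helper: succeeds only if the given value is "unlimited".
memlock_ok() {
    [ "$1" = "unlimited" ]
}

# Hypothetical helper: report the max locked memory limit of this shell,
# as printed by the shell builtin `ulimit -l`.
check_memlock() {
    limit=$(ulimit -l)
    if memlock_ok "$limit"; then
        echo "max locked memory: $limit (ok)"
    else
        echo "max locked memory: $limit (limited, may be too low)"
    fi
}

check_memlock
```

Since the remote process is started over ssh, it is probably worth
checking the limit the same way it is launched, for example with
"ssh 10.0.0.2 ulimit -l", and doing so on both hosts.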

On Fri, Sep 16, 2011 at 12:36 PM, Hoot Thompson <hoot at ptpnow.com> wrote:
> I compiled mvapich2 on a pair of ubuntu-11.04 servers and all seemed OK. Now,
> when I try to run the benchmarks, I get the following error, and it's not clear
> where to look to resolve the issue. Note that the benchmarks work fine if
> executed on one host.
>
>
>
> hoot at ubuntu1-bare:~/mvapich/mvapich2-1.7rc1/osu_benchmarks$ mpiexec -np 2
> -host 10.0.0.1,10.0.0.2 ./osu_bw
> Fatal error in MPI_Init:
> Other MPI error
>
> Fatal error in MPI_Init:
> Other MPI error
>
> hoot at ubuntu1-bare:~/mvapich/mvapich2-1.7rc1/osu_benchmarks$ mpiexec -v -np 2
> -host 10.0.0.1,10.0.0.2 ./osu_bw
> host: 10.0.0.1
> host: 10.0.0.2
>
> ==================================================================================================
> mpiexec options:
> ----------------
>   Base path: /usr/local/bin/
>   Launcher: (null)
>   Debug level: 1
>   Enable X: -1
>
>   Global environment:
>   -------------------
>     TERM=xterm
>     SHELL=/bin/bash
>
> XDG_SESSION_COOKIE=3291959ebb2860f6639f79380000c290-1316187385.360979-142487827
>     SSH_CLIENT=169.154.148.10 56349 22
>     SSH_TTY=/dev/pts/0
>     USER=hoot
>
> LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.axa=00;36:*.oga=00;36:*.spx=00;36:*.xspf=00;36:
>     MAIL=/var/mail/hoot
>
> PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
>     PWD=/home/hoot/mvapich/mvapich2-1.7rc1/osu_benchmarks
>     LANG=en_US.UTF-8
>     SHLVL=1
>     HOME=/home/hoot
>     LOGNAME=hoot
>     SSH_CONNECTION=169.154.148.10 56349 169.154.148.23 22
>     LESSOPEN=| /usr/bin/lesspipe %s
>     LESSCLOSE=/usr/bin/lesspipe %s %s
>     OLDPWD=/home/hoot/mvapich/mvapich2-1.7rc1
>     _=/usr/local/bin/mpiexec
>
>   Hydra internal environment:
>   ---------------------------
>     GFORTRAN_UNBUFFERED_PRECONNECTED=y
>
>
>     Proxy information:
>     *********************
>       [1] proxy: 10.0.0.1 (1 cores)
>       Exec list: ./osu_bw (1 processes);
>
>       [2] proxy: 10.0.0.2 (1 cores)
>       Exec list: ./osu_bw (1 processes);
>
>
> ==================================================================================================
>
> [mpiexec at ubuntu1-bare] Timeout set to -1 (-1 means infinite)
> [mpiexec at ubuntu1-bare] Got a control port string of 10.0.0.1:54213
>
> Proxy launch args: /usr/local/bin/hydra_pmi_proxy --control-port
> 10.0.0.1:54213 --debug --demux poll --pgid 0 --retries 10 --proxy-id
>
> [mpiexec at ubuntu1-bare] PMI FD: (null); PMI PORT: (null); PMI ID/RANK: -1
> Arguments being passed to proxy 0:
> --version 1.4 --interface-env-name MPICH_INTERFACE_HOSTNAME --hostname
> 10.0.0.1 --global-core-map 0,1,1 --filler-process-map 0,1,1
> --global-process-count 2 --auto-cleanup 1 --pmi-rank -1 --pmi-kvsname
> kvs_17187_0 --pmi-process-mapping (vector,(0,2,1)) --ckpoint-num -1
> --global-inherited-env 19 'TERM=xterm' 'SHELL=/bin/bash'
> 'XDG_SESSION_COOKIE=3291959ebb2860f6639f79380000c290-1316187385.360979-142487827'
> 'SSH_CLIENT=169.154.148.10 56349 22' 'SSH_TTY=/dev/pts/0' 'USER=hoot'
> 'LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.axa=00;36:*.oga=00;36:*.spx=00;36:*.xspf=00;36:'
> 'MAIL=/var/mail/hoot'
> 'PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games'
> 'PWD=/home/hoot/mvapich/mvapich2-1.7rc1/osu_benchmarks' 'LANG=en_US.UTF-8'
> 'SHLVL=1' 'HOME=/home/hoot' 'LOGNAME=hoot' 'SSH_CONNECTION=169.154.148.10
> 56349 169.154.148.23 22' 'LESSOPEN=| /usr/bin/lesspipe %s'
> 'LESSCLOSE=/usr/bin/lesspipe %s %s'
> 'OLDPWD=/home/hoot/mvapich/mvapich2-1.7rc1' '_=/usr/local/bin/mpiexec'
> --global-user-env 0 --global-system-env 1
> 'GFORTRAN_UNBUFFERED_PRECONNECTED=y' --proxy-core-count 1 --exec
> --exec-appnum 0 --exec-proc-count 1 --exec-local-env 0 --exec-wdir
> /home/hoot/mvapich/mvapich2-1.7rc1/osu_benchmarks --exec-args 1 ./osu_bw
>
> [mpiexec at ubuntu1-bare] PMI FD: (null); PMI PORT: (null); PMI ID/RANK: -1
> Arguments being passed to proxy 1:
> --version 1.4 --interface-env-name MPICH_INTERFACE_HOSTNAME --hostname
> 10.0.0.2 --global-core-map 1,1,0 --filler-process-map 1,1,0
> --global-process-count 2 --auto-cleanup 1 --pmi-rank -1 --pmi-kvsname
> kvs_17187_0 --pmi-process-mapping (vector,(0,2,1)) --ckpoint-num -1
> --global-inherited-env 19 'TERM=xterm' 'SHELL=/bin/bash'
> 'XDG_SESSION_COOKIE=3291959ebb2860f6639f79380000c290-1316187385.360979-142487827'
> 'SSH_CLIENT=169.154.148.10 56349 22' 'SSH_TTY=/dev/pts/0' 'USER=hoot'
> 'LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arj=01;31:*.taz=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lz=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.rar=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.axa=00;36:*.oga=00;36:*.spx=00;36:*.xspf=00;36:'
> 'MAIL=/var/mail/hoot'
> 'PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games'
> 'PWD=/home/hoot/mvapich/mvapich2-1.7rc1/osu_benchmarks' 'LANG=en_US.UTF-8'
> 'SHLVL=1' 'HOME=/home/hoot' 'LOGNAME=hoot' 'SSH_CONNECTION=169.154.148.10
> 56349 169.154.148.23 22' 'LESSOPEN=| /usr/bin/lesspipe %s'
> 'LESSCLOSE=/usr/bin/lesspipe %s %s'
> 'OLDPWD=/home/hoot/mvapich/mvapich2-1.7rc1' '_=/usr/local/bin/mpiexec'
> --global-user-env 0 --global-system-env 1
> 'GFORTRAN_UNBUFFERED_PRECONNECTED=y' --proxy-core-count 1 --exec
> --exec-appnum 0 --exec-proc-count 1 --exec-local-env 0 --exec-wdir
> /home/hoot/mvapich/mvapich2-1.7rc1/osu_benchmarks --exec-args 1 ./osu_bw
>
> [mpiexec at ubuntu1-bare] Launch arguments: /usr/local/bin/hydra_pmi_proxy
> --control-port 10.0.0.1:54213 --debug --demux poll --pgid 0 --retries 10
> --proxy-id 0
> [mpiexec at ubuntu1-bare] Launch arguments: /usr/bin/ssh -x 10.0.0.2
> "/usr/local/bin/hydra_pmi_proxy" --control-port 10.0.0.1:54213 --debug
> --demux poll --pgid 0 --retries 10 --proxy-id 1
> [proxy:0:0 at ubuntu1-bare] got pmi command (from 0): init
> pmi_version=1 pmi_subversion=1
> [proxy:0:0 at ubuntu1-bare] PMI response: cmd=response_to_init pmi_version=1
> pmi_subversion=1 rc=0
> [proxy:0:0 at ubuntu1-bare] got pmi command (from 0): get_maxes
>
> [proxy:0:0 at ubuntu1-bare] PMI response: cmd=maxes kvsname_max=256
> keylen_max=64 vallen_max=1024
> [proxy:0:0 at ubuntu1-bare] got pmi command (from 0): get_appnum
>
> [proxy:0:0 at ubuntu1-bare] PMI response: cmd=appnum appnum=0
> [proxy:0:0 at ubuntu1-bare] got pmi command (from 0): get_my_kvsname
>
> [proxy:0:0 at ubuntu1-bare] PMI response: cmd=my_kvsname kvsname=kvs_17187_0
> [proxy:0:0 at ubuntu1-bare] got pmi command (from 0): get_my_kvsname
>
> [proxy:0:0 at ubuntu1-bare] PMI response: cmd=my_kvsname kvsname=kvs_17187_0
> [proxy:0:0 at ubuntu1-bare] got pmi command (from 0): get
> kvsname=kvs_17187_0 key=PMI_process_mapping
> [proxy:0:0 at ubuntu1-bare] PMI response: cmd=get_result rc=0 msg=success
> value=(vector,(0,2,1))
> Fatal error in MPI_Init:
> Other MPI error
>
> [proxy:0:1 at ubuntu2-bare] got pmi command (from 4): init
> pmi_version=1 pmi_subversion=1
> [proxy:0:1 at ubuntu2-bare] PMI response: cmd=response_to_init pmi_version=1
> pmi_subversion=1 rc=0
> [proxy:0:1 at ubuntu2-bare] got pmi command (from 4): get_maxes
>
> [proxy:0:1 at ubuntu2-bare] PMI response: cmd=maxes kvsname_max=256
> keylen_max=64 vallen_max=1024
> [proxy:0:1 at ubuntu2-bare] got pmi command (from 4): get_appnum
>
> [proxy:0:1 at ubuntu2-bare] PMI response: cmd=appnum appnum=0
> [proxy:0:1 at ubuntu2-bare] got pmi command (from 4): get_my_kvsname
>
> [proxy:0:1 at ubuntu2-bare] PMI response: cmd=my_kvsname kvsname=kvs_17187_0
> [proxy:0:1 at ubuntu2-bare] got pmi command (from 4): get_my_kvsname
>
> [proxy:0:1 at ubuntu2-bare] PMI response: cmd=my_kvsname kvsname=kvs_17187_0
> [proxy:0:1 at ubuntu2-bare] got pmi command (from 4): get
> kvsname=kvs_17187_0 key=PMI_process_mapping
> [proxy:0:1 at ubuntu2-bare] PMI response: cmd=get_result rc=0 msg=success
> value=(vector,(0,2,1))
> Fatal error in MPI_Init:
> Other MPI error
>



-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo
