[mvapich-discuss] Can you help me ?

Qi Gao gaoq at cse.ohio-state.edu
Tue May 15 14:43:53 EDT 2007


Hi,

Thanks for trying MVAPICH2. We are glad to work with you to solve the problem.

From the log, it seems that the MPI program is having trouble connecting to MPD for CKPT support. To help us narrow down the problem, would you please first verify the following?
    1.  Did you use the make script "make.mvapich2.ofa" to compile and install MVAPICH2, and did you enable CKPT by setting the variable ENABLE_CKPT in that script to yes?
    2.  Are you using the MPD provided in the MVAPICH2 package as your process manager? That version of MPD includes the extended functionality needed to support CKPT.
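
One quick way to check the second point is to confirm that the mpdboot and mpiexec found on your PATH actually live under the MVAPICH2 install prefix. The following is only a minimal sketch: it assumes the prefix is the MPI_ROOT from your .bashrc (/usr/local/mvapich2p2), and the helper name tool_under_prefix is made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only if TOOL resolves to a path under PREFIX.
tool_under_prefix() {
    local tool="$1" prefix="$2" resolved
    # command -v prints the path the shell would execute; fail if not found.
    resolved="$(command -v "$tool")" || return 1
    case "$resolved" in
        "$prefix"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: MPI_ROOT points at the MVAPICH2 install, as in the quoted .bashrc.
prefix="${MPI_ROOT:-/usr/local/mvapich2p2}"
for tool in mpdboot mpiexec; do
    if tool_under_prefix "$tool" "$prefix"; then
        echo "$tool: found under $prefix"
    else
        echo "$tool: NOT under $prefix (resolved to: $(command -v "$tool" || echo none))"
    fi
done
```

If either tool resolves to a different directory, another MPI installation is probably shadowing the MVAPICH2 one on PATH. Since mpdboot starts the remote MPDs over ssh, it is worth running the same check on the compute nodes as well.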

Thanks.
--Qi

  ----- Original Message ----- 
  From: sunway qilu 
  To: mvapich-discuss at cse.ohio-state.edu 
  Sent: Monday, May 14, 2007 9:32 PM
  Subject: [mvapich-discuss] Can you help me ?


  Hello, 
  I have the same problem with BLCR and mvapich2-0.98. The error I get is as follows:

  [fortest at fes01 ~]$ mpdboot -n 2  
  [fortest at fes01 ~]$ ps ax|grep mpd
  13976 ?        S      0:00 python2.3 /usr/local/mvapich2p2/bin/mpd.py --ncpus=1 -e -d
  13977 pts/16   S      0:00 ssh -x -n gn01 /usr/local/mvapich2p2/bin/mpd.py  -h fes01 -p 46910  --ncpus=1 -e -d 
  14003 pts/16   S+     0:00 grep mpd

  [fortest at fes01 ~]$ mpiexec -n 2 ./cpi
  [Rank 0][cr.c: line 124]connect 24678 failed
  rank 0 in job 1  fes01_46910   caused collective abort of all ranks
    exit status of rank 0: killed by signal 9 
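
The failing call in the log above is a connect on 24678, which matches the MV2_CKPT_MPD_BASE_PORT value in the .bashrc shown further down. A simple probe can tell whether anything on the node accepts connections on that port. This is a rough sketch only: it assumes the number in the error message is the TCP port the rank tried to reach, it relies on bash's /dev/tcp redirection and the coreutils timeout command, and the helper name port_open is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if a TCP connection to HOST:PORT can be opened.
# Uses bash's /dev/tcp pseudo-device; gives up after 2 seconds.
port_open() {
    local host="$1" port="$2"
    timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# Assumption: the port of interest is the one set in MV2_CKPT_MPD_BASE_PORT.
port="${MV2_CKPT_MPD_BASE_PORT:-24678}"
if port_open localhost "$port"; then
    echo "port $port: something is listening"
else
    echo "port $port: nothing accepted the connection"
fi
```

Running this on each node after mpdboot shows whether a listener exists there at all. If nothing accepts the connection, the running MPD is probably not opening the checkpoint port.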


  The following are the configuration files of the user fortest:

  [fortest at fes01 ~]$ cat .bashrc
  # .bashrc

  # User specific aliases and functions

  # Source global definitions
  if [ -f /etc/bashrc ]; then
          . /etc/bashrc
  fi
  export MPI_ROOT=/usr/local/mvapich2p2
  export PATH=$MPI_ROOT/bin:$PATH
  export MV2_CKPT_FILE=/home/fortest/ckptfile
  export MV2_CKPT_INTERVAL=20 
  export MV2_CKPT_MAX_SAVE_CKPTS=3
  export MV2_CKPT_MPD_BASE_PORT=24678
  export MV2_CKPT_MPIEXEC_PORT=14678
  export VIADEV_DEFAULT_TIME_OUT=16


  [fortest at fes01 ~]$ cat mpd.hosts
  gn01
  gn02
  gn03
  gn04
  gn05
  gn06
  gn07
  gn08

  I look forward to your answer. Thanks a lot.




