[mvapich-discuss] failure during Init of 3456 process job
Dan Kokron
daniel.kokron at nasa.gov
Sat Mar 12 16:35:05 EST 2011
I am getting a failure during MPI_Init using mvapich2-1.6rc3 configured with
./configure CC=icc CXX=icpc F77=ifort F90=ifort CFLAGS=-fpic -DRDMA_CM
CXXFLAGS=-fpic -DRDMA_CM FFLAGS=-fpic F90FLAGS=-fpic
--prefix=/u/dkokron/play/mvapich2-1.6rc3/install.dbg --enable-f77
--enable-f90 --enable-cxx --enable-mpe --enable-romio
--with-file-system=lustre --enable-threads=default --with-rdma=gen2
--with-hwloc --enable-error-checking=all --enable-error-messages=all
--enable-g=all --enable-fast=none
mpirun_rsh -hostfile $PBS_NODEFILE -np 3456 GEOSgcm.x
Word too long.
child_handler: Error in init phase...wait for cleanup! (0/1mpispawn connections)
Failed in initilization phase, cleaned up all the mpispawn!
It seems others have seen this too, since mvapich-1.4.
http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/2009-November/002623.html
I tried setting MV2_FASTSSH_THRESHOLD, but that did not help with this
288-node job. Any ideas?
mpirun_rsh -hostfile /var/spool/pbs/aux/1576484.pbspl1.nas.nasa.gov -np 3456 MV2_FASTSSH_THRESHOLD=512 ./GEOSgcm.x
Word too long.
child_handler: Error in init phase...wait for cleanup! (0/1mpispawn connections)
Thanks
Dan