[mvapich-discuss] Re: confirm
43b1c24702cc8a5af97208dd4d163c4ca64ede44
Dan Kokron
daniel.kokron at nasa.gov
Thu May 6 15:26:00 EDT 2010
I hope this gets attached to the proper thread.
I wanted to say that I am seeing the exact same error as Battalgazi
YILDIRIM when using mvapich2-1.4.1. It runs fine with up to 64 processes
and dies with the following error when using more than 64:
mpirun_rsh -hostfile /var/spool/PBS/aux/3671575.borgmg -np 96 ./GEOSgcm.x
Fatal error in MPI_Init_thread:
Other MPI error, error stack:
MPIR_Init_thread(311)..: Initialization failed
MPID_Init(191).........: channel initialization failed
MPIDI_CH3_Init(156)....:
MPIDI_CH3I_CM_Init(993): Error initializing MVAPICH2 MPIU_Malloc library
I am able to run the application on more than 64 processes if mvapich2
is compiled with the ch3:sock channel:
./configure CC=gcc CXX=g++ F77=ifort F90=ifort \
  --prefix=/discover/nobackup/dkokron/mvapich2-1.4.1_debug_gcc_sock \
  --enable-g=all --enable-f77 --enable-f90 --enable-cxx --enable-romio \
  --with-device=ch3:sock
It also fails when compiled with gen2 without rdma_cm:
./configure CC=gcc CXX=g++ F77=ifort F90=ifort \
  --prefix=/discover/nobackup/dkokron/mvapich2-1.4.1_debug_gcc_gen2-cm \
  --enable-g=all --enable-f77 --enable-f90 --enable-cxx --enable-mpe \
  --enable-romio --enable-threads=multiple --with-rdma=gen2 \
  --enable-rdma-cm=no
I added some debug prints to
src/mpid/ch3/channels/mrail/src/memory/mem_hooks.c
to see where in mvapich2_minit it is failing. I'll report back with any
news.
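
For anyone wanting to do the same kind of tracing, here is a minimal
sketch of the debug-print approach in plain C. It is not the actual
mvapich2_minit code: DEBUG_TRACE, demo_step and demo_init are
hypothetical names made up for illustration. The idea is only to print
the file, line and function to stderr after each step and flush right
away, so the last message seen shows how far initialization got before
it failed.

```c
#include <stdio.h>

/* Hypothetical helper: print source location plus a message to stderr and
 * flush immediately, so the output is not lost if the process aborts. */
#define DEBUG_TRACE(msg)                                          \
    do {                                                          \
        fprintf(stderr, "[trace] %s:%d %s(): %s\n",               \
                __FILE__, __LINE__, __func__, (msg));             \
        fflush(stderr);                                           \
    } while (0)

/* Hypothetical stand-in for one initialization step.
 * Returns 0 on success, -1 on failure. */
static int demo_step(int should_fail)
{
    return should_fail ? -1 : 0;
}

/* Hypothetical init routine with two steps.
 * Returns 0 on success, or the 1-based index of the step that failed.
 * fail_at selects which step (if any) is made to fail. */
int demo_init(int fail_at)
{
    DEBUG_TRACE("entering");

    if (demo_step(fail_at == 1) != 0) {
        DEBUG_TRACE("step 1 failed");
        return 1;
    }
    DEBUG_TRACE("step 1 ok");

    if (demo_step(fail_at == 2) != 0) {
        DEBUG_TRACE("step 2 failed");
        return 2;
    }
    DEBUG_TRACE("step 2 ok");

    return 0;
}
```

Calling demo_init(2), for example, prints the "entering" and "step 1 ok"
lines followed by "step 2 failed", and returns 2. With many ranks it may
help to also print the rank or hostname in the message, since the
stderr output of all the processes is interleaved.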
--
Dan Kokron
Global Modeling and Assimilation Office
NASA Goddard Space Flight Center
Greenbelt, MD 20771
Daniel.S.Kokron at nasa.gov
Phone: (301) 614-5192
Fax: (301) 614-5304