[mvapich-discuss] mvapich2, channel initialization failed

Janusz Mordarski janusz.mordarski at uj.edu.pl
Mon May 31 09:37:17 EDT 2010


Hello, I built a cluster running CentOS 5.4 (x86_64) with OFED
1.5.2-beta1 (mvapich2-1.4.1) and mpiexec 0.84-pre (0.83 from svn).
When I try to run a job over InfiniBand, through PBS and mpiexec, it
fails:
=============

[unset]: connect failed with connection refused
[unset]: Unable to connect to nb2 on 51390
Fatal error in MPI_Init:
Other MPI error, error stack:
MPIR_Init_thread(311): Initialization failed
MPID_Init(168).......: channel initialization failed
MPID_Init(464).......: PMI_Init returned -1

=============

I am trying to run it on two nodes, nb1 and nb2.
The firewall is turned off.
I can ping every host from every other host, on the Ethernet as well as
on the InfiniBand addresses.

Everything works fine on my older cluster, which runs CentOS 5.2/5.3
and OFED 1.5.

Does it have something to do with the 'device/channel/interface'
settings, or with name resolution?
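
One way to test the name-resolution guess is to check, on each node,
what nb1 and nb2 resolve to and whether a TCP connection to the other
node can be opened at all. The sketch below is only an illustration,
assuming Python is available on the nodes. The helper names resolve,
can_connect and report are made up for this example and are not part of
mvapich2, mpiexec or PBS. The port in the error above (51390) is
presumably chosen per run, so can_connect is only meaningful against a
port that is known to be listening, for example sshd on 22 if it is
enabled.

```python
import socket


def resolve(host):
    """Return the sorted unique IP addresses `host` resolves to, or [] on failure."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        # The name is unknown to the resolver (/etc/hosts, DNS, ...).
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and resolution failures.
        return False


def report(hosts):
    """Print what each host name resolves to on the node where this runs."""
    for host in hosts:
        addrs = resolve(host)
        print(host, "->", ", ".join(addrs) if addrs else "does not resolve")
```

Calling report(["nb1", "nb2"]) on both nodes and comparing the output
would show whether a name maps to a different address on one of them
(a loopback address would be one thing to look for).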

Best regards,
Janusz

-- 
Dept of Computational Biophysics & Bioinformatics,
Faculty of Biochemistry, Biophysics and Biotechnology,
Jagiellonian University,
ul. Gronostajowa 7,
30-387 Krakow, Poland.
Tel: +48 519-353-198
     (+48-12)-664-6380


