[mvapich-discuss] MVAPICH error "connect: Network is unreachable"

Robert Jacobi rjacobi at email.arizona.edu
Mon Mar 7 21:44:34 EST 2011


Hello,

I am getting a very strange connection error that occurs ONLY when I 
launch a program with MVAPICH (mpirun_rsh) from my head node on compute 
node01:
------------
[18:10]:robert@salvator:~/TEST>mpirun_rsh -np 1 node01 cutiosmpi.mvapich
connect: Network is unreachable

Child exited abnormally!
Killing remote processes...DONE
[18:10]:robert@salvator:~/TEST>node01: Connection refused
------------

To locate the problem I've tried the following (with the same program as 
before), which all worked fine:
- launch the program with mvapich (mpirun_rsh) from the head node on any 
other compute node
- launch the program with mvapich (mpirun_rsh) from compute node01 on 
compute node01
- launch the program with mvapich (mpirun_rsh) from compute node02 on 
compute node01
- launch the program with mvapich (mpirun_rsh) from compute node01 on the 
head node
- launch the program with openmpi (mpirun) from the head node on compute 
node01 (compiled with openmpi)
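
The mvapich combinations above can be written down as a small launch 
matrix. The following is only a sketch: it prints the commands instead of 
running them, the helper name print_launch_matrix is made up for 
illustration, and it assumes the binary is reachable under the same name 
on every host. The host names are the ones from this report (head node 
salvator, compute nodes node01 and node02); the openmpi run is not covered.

```shell
#!/bin/sh
# Sketch: print (do not run) one mpirun_rsh launch per source/target pair.
# print_launch_matrix is a hypothetical helper, not part of MVAPICH.
print_launch_matrix() {
    prog=$1
    shift
    for src in "$@"; do
        for dst in "$@"; do
            # ssh to the source host, then launch a single rank on the target
            echo "ssh $src mpirun_rsh -np 1 $dst $prog"
        done
    done
}

print_launch_matrix cutiosmpi.mvapich salvator node01 node02
```

Each printed line can then be run by hand; in my case only the pair 
salvator -> node01 fails.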

I made sure that:
i) passwordless ssh is set up from the head node into all compute nodes 
(including node01), and from every compute node into every other compute 
node
ii) the compute nodes are in the hosts file with the right name
iii) mvapich is loaded:
------------
[18:16]:robert@salvator:~>mpi-selector --query
default:mvapich_intel-1.2.0
level:user
[18:16]:robert@salvator:~>which mpirun_rsh
/usr/mpi/intel/mvapich-1.2.0/bin/mpirun_rsh
------------
Rebooting both the head node and compute node01 didn't change anything. 
I've also tried this with benchmark tools and got the same error (both 
were compiled with mvapich and the Intel compiler).
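
Checks i) and ii) can be repeated per node with something like the 
following. This is a minimal sketch, not a definitive test: check_node is 
a hypothetical helper name, it assumes getent and an OpenSSH client are 
installed, and it only confirms name resolution and non-interactive ssh 
login.

```shell
#!/bin/sh
# Sketch: verify that a node name resolves and accepts passwordless ssh.
# check_node is a hypothetical helper, not part of MVAPICH or OFED.
check_node() {
    node=$1
    # ii) the name must resolve through the system's hosts/DNS lookup
    getent hosts "$node" >/dev/null 2>&1 \
        || { echo "$node: name does not resolve"; return 1; }
    # i) BatchMode=yes makes ssh fail instead of prompting for a password
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$node" true >/dev/null 2>&1 \
        || { echo "$node: passwordless ssh failed"; return 1; }
    echo "$node: ok"
}

# Example (node names taken from this report):
#   for n in node01 node02; do check_node "$n"; done
```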

We're running RHEL5.5, 2.6.18-194.el5 x86_64 and use InfiniBand 
(Mellanox switch). The MVAPICH library came with the Mellanox OFED 
firmware tools ("mft-2.6.2-10" downloaded from 
http://mellanox.com/content/pages.php?pg=management_tools&menu_section=34)

Thank you in advance for your help, and please let me know if you need 
further information!
I couldn't find any help for this error in the user guides or online, and 
I'm at a complete loss as to how to fix it.

Robert

-- 
Robert Jacobi
Research Assistant
University of Arizona
Department of Aerospace & Mechanical Engineering
1130 N. Mountain Ave.
Tucson, AZ, 85721-0119

tel: +1 (520) 621 4369
mail: rjacobi at email.arizona.edu


The less time you spent on algebra in life, the more time you have to be a happy person. (Kerschen)

Doubt is not a pleasant condition, but certainty is absurd. (Voltaire)

All great truths begin as blasphemies. (Shaw)

Thinking is something that follows difficulties and precedes action. (Brecht)
