[mvapich-discuss] MVAPICH error "connect: Network is unreachable"

Jonathan Perkins perkinjo at cse.ohio-state.edu
Thu Mar 10 19:03:31 EST 2011


Thanks for letting us know.

On Thu, Mar 10, 2011 at 6:46 PM, Robert Jacobi
<rjacobi at email.arizona.edu> wrote:
> Hello Jonathan,
>
> Although it turned out that the mvapich version wasn't the issue, the error
> output from mpiexec.hydra helped me solve the problem. I didn't realize that
> mvapich, unlike openmpi, apparently opens a connection from the compute node
> back to the launch node. Because of a missing route in the network setup on
> this compute node, that connection couldn't be established, and hence
> mpirun_rsh failed. Now that I have fixed that, it appears to work.
>
> Robert
>
> Jonathan Perkins wrote:
>>
>> Hello Robert.  There may be some issue related to the version of
>> mpirun_rsh you're using, which is now getting outdated.  Our latest
>> changes have gone into mvapich2.  Can you try downloading
>> mvapich2-1.6rc3 from our website
>> (http://mvapich.cse.ohio-state.edu/download/mvapich2/)?  Please try
>> running your program using this with mpirun_rsh and/or mpiexec.hydra
>> to see if you continue to experience this problem.
>>
>> On Mon, Mar 7, 2011 at 9:44 PM, Robert Jacobi <rjacobi at email.arizona.edu>
>> wrote:
>>
>>>
>>> Hello,
>>>
>>> I have a very strange connection error that occurs ONLY when I try to
>>> launch a program with mvapich (mpirun_rsh) from my head node on compute
>>> node01 specifically:
>>> ------------
>>> [18:10]:robert@salvator:~/TEST>mpirun_rsh -np 1 node01 cutiosmpi.mvapich
>>> connect: Network is unreachable
>>>
>>> Child exited abnormally!
>>> Killing remote processes...DONE
>>> [18:10]:robert@salvator:~/TEST>node01: Connection refused
>>> ------------
>>>
>>> To locate the problem I've tried the following (with the same program as
>>> before), all of which worked fine:
>>> - launch the program with mvapich (mpirun_rsh) from the head node on any
>>>   other compute node
>>> - launch the program with mvapich (mpirun_rsh) from compute node01 on
>>>   compute node01
>>> - launch the program with mvapich (mpirun_rsh) from compute node02 on
>>>   compute node01
>>> - launch the program with mvapich (mpirun_rsh) from compute node01 on the
>>>   head node
>>> - launch the program with openmpi (mpirun) from the head node on compute
>>>   node01 (compiled with openmpi)
>>>
>>> I made sure that:
>>> i) passwordless ssh is set up from the head node into all compute nodes
>>> (including node01), and between all the compute nodes
>>> ii) the compute nodes are in the hosts file with the right names
>>> iii) mvapich is loaded:
>>> ------------
>>> [18:16]:robert@salvator:~>mpi-selector --query
>>> default:mvapich_intel-1.2.0
>>> level:user
>>> [18:16]:robert@salvator:~>which mpirun_rsh
>>> /usr/mpi/intel/mvapich-1.2.0/bin/mpirun_rsh
>>> ------------
>>> Rebooting both the head node and compute node01 didn't change anything.
>>> I've also tried this with benchmark tools and got the same error (both
>>> were compiled with mvapich and the Intel compiler).
>>>
>>> We're running RHEL 5.5 (kernel 2.6.18-194.el5, x86_64) and use InfiniBand
>>> (Mellanox switch). The MVAPICH library came with the Mellanox OFED firmware
>>> tools ("mft-2.6.2-10", downloaded from
>>> http://mellanox.com/content/pages.php?pg=management_tools&menu_section=34).
>>>
>>> Thank you in advance for your help, and please let me know if you need
>>> further information!
>>> I couldn't find any help for this error in the user guides or online, and
>>> I'm at a complete loss as to how to go about fixing it.
>>>
>>> Robert
>>>
>>> --
>>> Robert Jacobi
>>> Research Assistant
>>> University of Arizona
>>> Department of Aerospace & Mechanical Engineering
>>> 1130 N. Mountain Ave.
>>> Tucson, AZ, 85721-0119
>>>
>>> tel: +1 (520) 621 4369
>>> mail: rjacobi at email.arizona.edu
>>>
>>>
>>> The less time you spend on algebra in life, the more time you have to be a
>>> happy person. (Kerschen)
>>>
>>> Doubt is not a pleasant condition, but certainty is absurd. (Voltaire)
>>>
>>> All great truths begin as blasphemies. (Shaw)
>>>
>>> Thinking is something that follows difficulties and precedes action.
>>> (Brecht)
>>>
>>> _______________________________________________
>>> mvapich-discuss mailing list
>>> mvapich-discuss at cse.ohio-state.edu
>>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>>
>>>
>>>
>>
>>
>>
>>
>



-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo
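
The resolution quoted above (a missing route on node01 kept the compute node
from connecting back to the launch node) can be checked directly from the
compute node. What follows is only a minimal Python sketch under stated
assumptions, not a description of how mpirun_rsh works internally: the function
names probe and classify_connect_error are hypothetical, and the port is a
stand-in, because the thread does not say which port mpirun_rsh listens on. A
routing failure is reported as ENETUNREACH ("Network is unreachable")
regardless of the port, so any port should be enough to reveal it.

```python
import errno
import socket


def classify_connect_error(err):
    """Turn an OSError raised by connect() into a short diagnosis."""
    if err.errno == errno.ENETUNREACH:
        # Same condition as "connect: Network is unreachable" in the thread.
        return "no route: check the routing table on this node"
    if err.errno == errno.EHOSTUNREACH:
        return "host unreachable: no path to this particular host"
    if err.errno == errno.ECONNREFUSED:
        return "refused: the host is reachable but nothing listens on that port"
    return "other error: %s" % err


def probe(host, port, timeout=3.0):
    """Try one TCP connection to (host, port) and report what happened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except socket.timeout:
        return "timeout: no reply within %.1f s" % timeout
    except socket.gaierror as err:
        return "name lookup failed: %s" % err
    except OSError as err:
        # Covers ENETUNREACH, EHOSTUNREACH, ECONNREFUSED and anything else.
        return classify_connect_error(err)
```

Run on the compute node, a call such as probe("salvator", 22) would test the
path back to the head node. The hostname is taken from the shell prompt in the
thread, and port 22 (sshd) is an assumption chosen only because something is
likely to be listening there.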


