[mvapich-discuss] mpirun_rsh error

Jonathan Perkins perkinjo at cse.ohio-state.edu
Wed Apr 24 15:11:26 EDT 2013


On Wed, Apr 24, 2013 at 02:48:00PM -0400, Malek Musleh wrote:
> I disabled iptables on the nodes I was trying to run on, and it worked, so
> the firewall was the issue. That being said, I need to find a long-term
> solution that keeps the firewalls active while I run MPI experiments
> across the InfiniBand ports. Would it be as simple as hacking the source
> code to have MVAPICH use a fixed/hard-coded port number instead of
> performing a system call?

I think it'll be easier to set up iptables to allow connections between
your compute nodes (for instance, allow connections within
192.168.1.1/16) than to try to modify MVAPICH2 to use a set range of
ports for incoming connections.
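
To make the subnet idea concrete, here is a small Python sketch (an
illustration only, not something MVAPICH2 ships) that checks which node
addresses fall inside a trusted range and prints a candidate iptables
rule. The 10.2.0.0/16 subnet and the second node address are assumptions
based on the 10.2.4.4 address in your command; substitute whatever range
your compute nodes actually use.

```python
import ipaddress

def trusted_network(cidr):
    """Normalize a CIDR string; strict=False accepts host bits, e.g. 192.168.1.1/16."""
    return ipaddress.ip_network(cidr, strict=False)

def accept_rule(network):
    """Build an iptables command that accepts all traffic from the given subnet."""
    return f"iptables -A INPUT -s {network} -j ACCEPT"

def nodes_outside(network, nodes):
    """Return the node addresses that the subnet rule would NOT cover."""
    return [n for n in nodes if ipaddress.ip_address(n) not in network]

if __name__ == "__main__":
    # Hypothetical values: adjust to your cluster.
    net = trusted_network("10.2.0.0/16")
    nodes = ["10.2.4.4", "10.2.4.5"]
    print(accept_rule(net))
    print("not covered:", nodes_outside(net, nodes))
```

Note that -A appends the rule, so it only helps if no earlier rule in the
INPUT chain already rejects the traffic; -I inserts it at the top instead.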

Setting the ports to a restricted range for mpirun_rsh and mpispawn is
possible but will require logic to find a new port to use if the
"standard" one is in use already.

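For illustration, the difference between the two approaches can be
sketched in Python. This is not the mpirun_rsh/mpispawn code, just a
minimal example of the general technique: binding to port 0 lets the OS
pick any free port, while a restricted range needs a retry loop that
skips ports already in use. The function names and the 50000-50010 range
are made up for the example.

```python
import errno
import socket

def bind_os_assigned(host="127.0.0.1"):
    """Ask the OS for any free port by binding to port 0."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))
    sock.listen(1)
    return sock

def bind_in_range(first, last, host="127.0.0.1"):
    """Bind to the first free port in [first, last], skipping busy ports."""
    for port in range(first, last + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError as exc:
            sock.close()
            if exc.errno != errno.EADDRINUSE:
                raise  # e.g. permission denied: retrying will not help
            continue
        sock.listen(1)
        return sock
    raise OSError(errno.EADDRINUSE, f"no free port in {first}-{last}")

if __name__ == "__main__":
    with bind_os_assigned() as a:
        print("OS-assigned port:", a.getsockname()[1])
    # 50000-50010 is an arbitrary example range.
    with bind_in_range(50000, 50010) as b:
        print("port from fixed range:", b.getsockname()[1])
```
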
It looks like you can also try using mpiexec (hydra) with the
MPIEXEC_PORT_RANGE variable if you want to restrict the ports used by
the launcher.

http://wiki.mpich.org/mpich/index.php/Using_the_Hydra_Process_Manager#Environment_Settings
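
As a rough sketch of what that could look like, the Python wrapper below
builds an mpiexec command line with MPIEXEC_PORT_RANGE set in the
environment. The "low:high" value format is my assumption (please confirm
it against the page above), and the 50000-50100 range and helper names
are only examples.

```python
import os
import subprocess

def hydra_command(program, hosts, nprocs, port_range):
    """Return (argv, env) for an mpiexec run with the launcher ports restricted."""
    low, high = port_range
    if not (0 < low <= high <= 65535):
        raise ValueError("port range must satisfy 0 < low <= high <= 65535")
    env = dict(os.environ)
    # Assumed format "low:high"; verify against the Hydra documentation.
    env["MPIEXEC_PORT_RANGE"] = f"{low}:{high}"
    argv = ["mpiexec", "-n", str(nprocs), "-hosts", ",".join(hosts), program]
    return argv, env

def launch(argv, env):
    """Run the job; requires mpiexec (hydra) on PATH."""
    return subprocess.run(argv, env=env, check=True)

if __name__ == "__main__":
    argv, env = hydra_command("./helloworld", ["10.2.4.4"], 1, (50000, 50100))
    print("MPIEXEC_PORT_RANGE=" + env["MPIEXEC_PORT_RANGE"], " ".join(argv))
    # launch(argv, env)  # uncomment on a machine with hydra installed
```

The firewall would then need to allow that same port range between the
nodes. This only restricts the ports used by the launcher.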

Please let me know if this works.

> 
> Malek
> 
> 
> On Wed, Apr 24, 2013 at 2:20 PM, Jonathan Perkins <
> perkinjo at cse.ohio-state.edu> wrote:
> 
> > On Wed, Apr 24, 2013 at 02:13:45PM -0400, Malek Musleh wrote:
> > > Yes, there are active firewalls on the network; I forgot about that
> > > potentially being an issue. I don't think it's permissible for me to
> > > disable the firewalls completely. Which ports can MVAPICH be
> > > configured to use?
> >
> > MVAPICH2 doesn't use a predetermined port but one that is given to it by
> > the OS when requested.  You don't need to disable the firewall but can
> > you allow connections between machines that are on your network?
> >
> > >
> > > The configure file was missing when I did an svn checkout; the branch
> > > tarball (where the install worked) is the one I am using.
> >
> > Thanks.  There are also tarballs for the trunk branch; however, that
> > won't resolve the issue that you're facing with the firewall setup.
> >
> > >
> > > Malek
> > >
> > >
> > > On Wed, Apr 24, 2013 at 2:10 PM, Jonathan Perkins <
> > > perkinjo at cse.ohio-state.edu> wrote:
> > >
> > > > On Wed, Apr 24, 2013 at 01:33:29PM -0400, Malek Musleh wrote:
> > > > > Hi,
> > > > >
> > > > > > I am encountering a problem when running mpirun_rsh across any
> > > > > > external node (any node besides the host machine from where it is
> > > > > > launched).
> > > > >
> > > > > This is the command line I used:
> > > > >
> > > > > mpirun_rsh -np 1 10.2.4.4 ./helloworld
> > > > >
> > > > > > (where the IP address is not the IP address of the current host).
> > > > > > I am able to ssh directly (without a password) to the machine, so
> > > > > > I am not sure why connectivity is an issue.
> > > > >
> > > > > I get the following error:
> > > > >
> > > > > > [gpu6.east.isi.edu:mpirun_rsh][mpispawn_checkin] connect() failed: (113)
> > > > > > [gpu6.east.isi.edu:mpirun_rsh][wfe_thread] Internal error: transition failed
> > > >
> > > > Is there an active firewall on the machines?  It looks like the connect
> > > > call is failing when mpirun_rsh is trying to respond back to the remote
> > > > node that just checked in.
> > > >
> > > > >
> > > > > > Likewise, when I run the command on node B to launch onto node A,
> > > > > > the same error occurs. Both machines have MVAPICH installed, and
> > > > > > paths are set up as well.
> > > > >
> > > > > The revision I am using is: mvapich2-1.8-r5827
> > > > >
> > > > > > This is not the latest, but when I tried the latest, it didn't have a
> > > > > > ./configure file, so I opted to try a branch version instead, hoping
> > > > > > it was more stable.
> > > >
> > > > Did you download a tarball or did you use svn to grab the latest code?
> > > > If you downloaded a tarball missing configure please let us know which
> > > > one is broken.  Everything looks fine with the tarballs for rc1 as well
> > > > as the nightly tarballs for 1.8 and trunk.
> > > >
> > > > >
> > > > > > Any ideas? A Google search wasn't much help.
> > > > >
> > > > > Malek
> > > >
> > > >
> > > >
> > > > --
> > > > Jonathan Perkins
> > > > http://www.cse.ohio-state.edu/~perkinjo
> > > >
> >
> > --
> > Jonathan Perkins
> > http://www.cse.ohio-state.edu/~perkinjo
> >

-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo

