[mvapich-discuss] mpiexec.hydra error - unable to connect from compute nodes to master port 42773
bright.yang at vaisala.com
Tue Sep 13 18:56:50 EDT 2011
You are right. The port number is randomly selected each time. This installation was working before but is broken now. I need to look up how to check the firewall.
Bright Yang
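
As a starting point for the firewall check, here is a minimal Python sketch that could be run on a compute node to see how a TCP connection to the head node fails. The function names `probe` and `classify` are hypothetical, and the interpretation is a general rule of thumb rather than anything specific to Hydra: "timeout" (as in the error quoted below) usually means packets are being silently dropped, for example by a firewall rule or a routing problem, while "refused" usually means the host was reached but nothing was listening on that port.

```python
import socket


def classify(exc):
    """Map a connection exception to a short diagnostic label."""
    # socket.timeout is an alias of TimeoutError on Python 3.10+,
    # and a separate class on older versions, so check both.
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"   # packets likely dropped (firewall, routing)
    if isinstance(exc, ConnectionRefusedError):
        return "refused"   # host reachable, but no listener on the port
    return "error"         # anything else, e.g. name resolution failure


def probe(host, port, timeout=5.0):
    """Try one TCP connection and report how it ended."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except OSError as exc:
        return classify(exc)
```

For example, `probe("kratos.vaisala.com", 22)` from compute-0-0 would test a port that is known to be listening (sshd). Because the mpiexec.hydra port changes on every run, probing it is only meaningful while the job launcher is still running.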
-----Original Message-----
From: Jonathan Perkins [mailto:perkinjo at cse.ohio-state.edu]
Sent: Tuesday, September 13, 2011 4:48 PM
To: Yang Bright BRYA
Cc: mvapich-discuss at cse.ohio-state.edu
Subject: Re: [mvapich-discuss] mpiexec.hydra error - unable to connect from compute nodes to master port 42773
Hi, is this a new installation or an installation that was previously
working? The error message suggests checking for firewalls which may
prevent the launched processes from connecting back to the head node.
I believe this port number is randomly chosen and may differ between
consecutive runs.
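
To illustrate why a port like this appears in no configuration file and does not show up in netstat once the launcher has exited, here is a minimal Python sketch. It assumes the common mechanism of binding to port 0 so that the operating system assigns a free port; whether Hydra does exactly this is an assumption on my part, not something taken from its source. The helper names `open_listener` and `is_listening` are hypothetical.

```python
import socket


def open_listener(host="127.0.0.1"):
    """Bind to port 0 so the OS picks a free port, then listen on it."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))              # 0 means "any free port"
    srv.listen(1)
    return srv, srv.getsockname()[1]  # the port the OS actually assigned


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    srv, port = open_listener()
    print("OS-assigned port:", port)
    print("listening while socket is open:", is_listening("127.0.0.1", port))
    srv.close()  # after this, the port no longer appears as a listener
```

The port number printed will generally differ from run to run, and the listener exists only while the socket is open, which would explain a netstat taken after the failed run showing nothing on 42773.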
On Tue, Sep 13, 2011 at 5:24 PM, <bright.yang at vaisala.com> wrote:
> Here is what I got when trying to run a parallel job -
>
> # mpiexec.hydra -f 2_12hosts -n 24 ./wrf.exe
>
> [proxy:0:1@compute-0-1.local] HYDU_sock_connect (./utils/sock/sock.c:188): unable to connect from "compute-0-1.local" to "kratos.vaisala.com" (Connection timed out)
> [proxy:0:1@compute-0-1.local] main (./pm/pmiserv/pmip.c:205): unable to connect to server kratos.vaisala.com at port 42773 (check for firewalls!)
> [proxy:0:0@compute-0-0.local] HYDU_sock_connect (./utils/sock/sock.c:188): unable to connect from "compute-0-0.local" to "kratos.vaisala.com" (Connection timed out)
> [proxy:0:0@compute-0-0.local] main (./pm/pmiserv/pmip.c:205): unable to connect to server kratos.vaisala.com at port 42773 (check for firewalls!)
>
>
>
> Where is port 42773 specified? Is it in a configuration file? I ran netstat,
> and there is no listener on that port:
>
>
>
> netstat --tcp --udp --listening --program
>
> Active Internet connections (only servers)
> Proto Recv-Q Send-Q Local Address                Foreign Address State   PID/Program name
> tcp        0      0 *:40000                      *:*             LISTEN  5394/mysqld
> tcp        0      0 *:nfs                        *:*             LISTEN  -
> tcp        0      0 *:805                        *:*             LISTEN  5290/rpc.mountd
> tcp        0      0 *:54438                      *:*             LISTEN  6994/pgroupd
> tcp        0      0 localhost.localdomain:smux   *:*             LISTEN  5152/snmpd
> tcp        0      0 *:8649                       *:*             LISTEN  6224/gmond
> tcp        0      0 *:52010                      *:*             LISTEN  -
> tcp        0      0 *:8651                       *:*             LISTEN  4652/gmetad
> tcp        0      0 localhost.localdomain:5900   *:*             LISTEN  5953/Xorg
> tcp        0      0 *:8652                       *:*             LISTEN  4652/gmetad
> tcp        0      0 *:941                        *:*             LISTEN  4575/rpc.statd
> tcp        0      0 *:sunrpc                     *:*             LISTEN  4510/portmap
> tcp        0      0 10.10.120.21:domain          *:*             LISTEN  4481/named
> tcp        0      0 kratos.local:domain          *:*             LISTEN  4481/named
> tcp        0      0 localhost.localdomai:domain  *:*             LISTEN  4481/named
> tcp        0      0 *:27000                      *:*             LISTEN  6993/lmgrd
> tcp        0      0 *:opalis-rdv                 *:*             LISTEN  4997/sge_qmaster
> tcp        0      0 *:smtp                       *:*             LISTEN  5472/master
> tcp        0      0 localhost.localdomain:rndc   *:*             LISTEN  4481/named
> tcp        0      0 *:734                        *:*             LISTEN  5218/rpc.rquotad
> tcp        0      0 *:http                       *:*             LISTEN  5484/httpd
> tcp        0      0 *:ssh                        *:*             LISTEN  29581/sshd
> tcp        0      0 *:https                      *:*             LISTEN  5484/httpd
> udp        0      0 *:nfs                        *:*                     -
> udp        0      0 *:syslog                     *:*                     4362/syslogd
> udp        0      0 *:9632                       *:*                     5685/tracker-server
> udp        0      0 *:snmp                       *:*                     5152/snmpd
> udp        0      0 *:802                        *:*                     5290/rpc.mountd
> udp        0      0 *:935                        *:*                     4575/rpc.statd
> udp        0      0 *:938                        *:*                     4575/rpc.statd
> udp        0      0 10.10.120.21:domain          *:*                     4481/named
> udp        0      0 kratos.local:domain          *:*                     4481/named
> udp        0      0 localhost.locald:domain      *:*                     4481/named
> udp        0      0 *:bootps                     *:*                     5509/dhcpd
> udp        0      0 *:tftp                       *:*                     5179/xinetd
> udp        0      0 *:8649                       *:*                     6224/gmond
> udp        0      0 *:47439                      *:*                     -
> udp        0      0 *:netviewdm3                 *:*                     5218/rpc.rquotad
> udp        0      0 *:sunrpc                     *:*                     4510/portmap
> udp        0      0 10.10.120.21:ntp             *:*                     20331/ntpd
> udp        0      0 kratos.local:ntp             *:*                     20331/ntpd
> udp        0      0 localhost.localdomain:ntp    *:*                     20331/ntpd
> udp        0      0 *:ntp                        *:*                     20331/ntpd
> udp        0      0 fe80::225:90ff:fe19:cdf:ntp  *:*                     20331/ntpd
> udp        0      0 fe80::225:90ff:fe19:cde:ntp  *:*                     20331/ntpd
> udp        0      0 localhost:ntp                *:*                     20331/ntpd
> udp        0      0 *:ntp                        *:*                     20331/ntpd
>
>
>
> Thanks.
>
> Bright Yang
--
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo