[mvapich-discuss] mpiexec.hydra error - unable to connect from compute nodes to master port 42773
Jonathan Perkins
perkinjo at cse.ohio-state.edu
Tue Sep 13 18:58:48 EDT 2011
You may also want to ask your system administrator whether any recent
system changes could have caused this.
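
To check whether a firewall is in the way, independent of MPI, one option
is a plain TCP test between a compute node and the head node. The Python
sketch below is only an illustration and is not part of MVAPICH2 or Hydra.
The helper names can_connect and open_ephemeral_listener are made up for
this example. It lets the OS pick a listening port by binding to port 0,
and then tries to connect to it.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def open_ephemeral_listener(bind_addr="127.0.0.1"):
    """Listen on a port chosen by the OS (port 0) and return the socket."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((bind_addr, 0))
    srv.listen(1)
    return srv


if __name__ == "__main__":
    # Local demonstration: the port only accepts connections while the
    # listener is open.
    srv = open_ephemeral_listener()
    port = srv.getsockname()[1]
    print("listening on port", port)
    print("reachable while open:", can_connect("127.0.0.1", port))
    srv.close()
    print("reachable after close:", can_connect("127.0.0.1", port))
```

For a real test, you would start the listener on the head node with
bind_addr="0.0.0.0" and call can_connect("kratos.vaisala.com", port) from a
compute node, using the port the listener printed. If the head node is
reachable, a port with no listener is normally refused quickly. A connection
that hangs until the timeout, as in the errors below, usually means packets
are being dropped, most often by a firewall, or the hostname resolves to an
address the compute nodes cannot route to. The demonstration also suggests
why netstat would show no listener on port 42773: a port opened this way
exists only while the owning process is running.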
On Tue, Sep 13, 2011 at 6:56 PM, <bright.yang at vaisala.com> wrote:
> You are right. The port number is randomly selected each time. It was working before but is broken now. I need to look up how to check the firewall...
>
> Bright Yang
>
> -----Original Message-----
> From: Jonathan Perkins [mailto:perkinjo at cse.ohio-state.edu]
> Sent: Tuesday, September 13, 2011 4:48 PM
> To: Yang Bright BRYA
> Cc: mvapich-discuss at cse.ohio-state.edu
> Subject: Re: [mvapich-discuss] mpiexec.hydra error - unable to connect from compute nodes to master port 42773
>
> Hi, is this a new installation or one that was previously working?
> The error message suggests checking for firewalls, which may prevent
> the launched processes from connecting back to the head node. I
> believe this port number is chosen randomly and may differ between
> consecutive runs.
>
> On Tue, Sep 13, 2011 at 5:24 PM, <bright.yang at vaisala.com> wrote:
>> Here is what I got when trying to run a parallel job -
>>
>> # mpiexec.hydra -f 2_12hosts -n 24 ./wrf.exe
>>
>> [proxy:0:1@compute-0-1.local] HYDU_sock_connect (./utils/sock/sock.c:188): unable to connect from "compute-0-1.local" to "kratos.vaisala.com" (Connection timed out)
>> [proxy:0:1@compute-0-1.local] main (./pm/pmiserv/pmip.c:205): unable to connect to server kratos.vaisala.com at port 42773 (check for firewalls!)
>> [proxy:0:0@compute-0-0.local] HYDU_sock_connect (./utils/sock/sock.c:188): unable to connect from "compute-0-0.local" to "kratos.vaisala.com" (Connection timed out)
>> [proxy:0:0@compute-0-0.local] main (./pm/pmiserv/pmip.c:205): unable to connect to server kratos.vaisala.com at port 42773 (check for firewalls!)
>>
>>
>>
>> Where is port 42773 specified? Is it in a configuration file? I ran
>> netstat, and there is no listener on that port:
>>
>>
>>
>> netstat --tcp --udp --listening --program
>>
>> Active Internet connections (only servers)
>> Proto Recv-Q Send-Q Local Address                Foreign Address  State   PID/Program name
>> tcp        0      0 *:40000                      *:*              LISTEN  5394/mysqld
>> tcp        0      0 *:nfs                        *:*              LISTEN  -
>> tcp        0      0 *:805                        *:*              LISTEN  5290/rpc.mountd
>> tcp        0      0 *:54438                      *:*              LISTEN  6994/pgroupd
>> tcp        0      0 localhost.localdomain:smux   *:*              LISTEN  5152/snmpd
>> tcp        0      0 *:8649                       *:*              LISTEN  6224/gmond
>> tcp        0      0 *:52010                      *:*              LISTEN  -
>> tcp        0      0 *:8651                       *:*              LISTEN  4652/gmetad
>> tcp        0      0 localhost.localdomain:5900   *:*              LISTEN  5953/Xorg
>> tcp        0      0 *:8652                       *:*              LISTEN  4652/gmetad
>> tcp        0      0 *:941                        *:*              LISTEN  4575/rpc.statd
>> tcp        0      0 *:sunrpc                     *:*              LISTEN  4510/portmap
>> tcp        0      0 10.10.120.21:domain          *:*              LISTEN  4481/named
>> tcp        0      0 kratos.local:domain          *:*              LISTEN  4481/named
>> tcp        0      0 localhost.localdomai:domain  *:*              LISTEN  4481/named
>> tcp        0      0 *:27000                      *:*              LISTEN  6993/lmgrd
>> tcp        0      0 *:opalis-rdv                 *:*              LISTEN  4997/sge_qmaster
>> tcp        0      0 *:smtp                       *:*              LISTEN  5472/master
>> tcp        0      0 localhost.localdomain:rndc   *:*              LISTEN  4481/named
>> tcp        0      0 *:734                        *:*              LISTEN  5218/rpc.rquotad
>> tcp        0      0 *:http                       *:*              LISTEN  5484/httpd
>> tcp        0      0 *:ssh                        *:*              LISTEN  29581/sshd
>> tcp        0      0 *:https                      *:*              LISTEN  5484/httpd
>> udp        0      0 *:nfs                        *:*                      -
>> udp        0      0 *:syslog                     *:*                      4362/syslogd
>> udp        0      0 *:9632                       *:*                      5685/tracker-server
>> udp        0      0 *:snmp                       *:*                      5152/snmpd
>> udp        0      0 *:802                        *:*                      5290/rpc.mountd
>> udp        0      0 *:935                        *:*                      4575/rpc.statd
>> udp        0      0 *:938                        *:*                      4575/rpc.statd
>> udp        0      0 10.10.120.21:domain          *:*                      4481/named
>> udp        0      0 kratos.local:domain          *:*                      4481/named
>> udp        0      0 localhost.locald:domain      *:*                      4481/named
>> udp        0      0 *:bootps                     *:*                      5509/dhcpd
>> udp        0      0 *:tftp                       *:*                      5179/xinetd
>> udp        0      0 *:8649                       *:*                      6224/gmond
>> udp        0      0 *:47439                      *:*                      -
>> udp        0      0 *:netviewdm3                 *:*                      5218/rpc.rquotad
>> udp        0      0 *:sunrpc                     *:*                      4510/portmap
>> udp        0      0 10.10.120.21:ntp             *:*                      20331/ntpd
>> udp        0      0 kratos.local:ntp             *:*                      20331/ntpd
>> udp        0      0 localhost.localdomain:ntp    *:*                      20331/ntpd
>> udp        0      0 *:ntp                        *:*                      20331/ntpd
>> udp        0      0 fe80::225:90ff:fe19:cdf:ntp  *:*                      20331/ntpd
>> udp        0      0 fe80::225:90ff:fe19:cde:ntp  *:*                      20331/ntpd
>> udp        0      0 localhost:ntp                *:*                      20331/ntpd
>> udp        0      0 *:ntp                        *:*                      20331/ntpd
>>
>>
>>
>> Thanks.
>>
>> Bright Yang
> --
> Jonathan Perkins
> http://www.cse.ohio-state.edu/~perkinjo
>
>
--
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo