[mvapich-discuss] can't set up mpd ring between two nodes (fwd)

wei huang huanwei at cse.ohio-state.edu
Wed Dec 26 11:22:19 EST 2007


Hi,

Thanks for using mvapich2.

> 	I installed mvapich2 , which is with the OFED 1.2.5.

So are you using the default installation that comes with the OFED package?

> 	1. when I use mpdboot on a machine, I got :
> 	  mpdboot_inode02 (handle_mpd_output 359): failed to ping mpd on inode02; recvd output={}

Several things can cause this failure, but there are a few
things to check first:

1) Do you have another mpd already running on the same set of nodes
(under the same user name)?

2) Do you have .mpd.conf in your home directory?
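
For the second check, here is a rough sketch of what to look for. It
assumes the usual MPICH2 convention that the file is readable only by
its owner and contains a secretword (or MPD_SECRETWORD) entry. The
function name check_mpd_conf is just an illustrative helper, not part
of mvapich2:

```python
import os
import stat


def check_mpd_conf(path):
    """Return a list of problems found with an mpd configuration file.

    Assumption: mpd expects the file to be private to the user and to
    define either 'secretword' or 'MPD_SECRETWORD'.
    """
    if not os.path.isfile(path):
        return ["%s does not exist" % path]

    problems = []

    # Any permission bit set for group or others counts as too open.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        problems.append(
            "%s has mode %o; try 'chmod 600 %s'" % (path, mode, path)
        )

    # Collect the key names from non-comment 'key=value' lines.
    with open(path) as conf:
        keys = [
            line.split("=", 1)[0].strip()
            for line in conf
            if "=" in line and not line.lstrip().startswith("#")
        ]
    if "secretword" not in keys and "MPD_SECRETWORD" not in keys:
        problems.append("%s has no secretword entry" % path)

    return problems


if __name__ == "__main__":
    found = check_mpd_conf(os.path.expanduser("~/.mpd.conf"))
    for line in found or ["~/.mpd.conf looks fine"]:
        print(line)
```

Run it on every node in the ring. The secretword should be the same
on all of them for the same user.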

I also want to mention that we have already released mvapich2-1.0.1. You
can try that by downloading the software package from our website:

http://mvapich.cse.ohio-state.edu/

There is a file called README_MPICH2 in the package. You can also read
that for more details on setting up mpd rings.

Please let us know if this works.

Thanks.

-- Wei

> 	2.  when I try to use mpd to set up mpd ring, as the user guide of mpich2:
> 			mpd &                       on node02
> 			mpd -h node02 -p port       on node01
> 	I got:
> on node01:  (the latter mpd)
> inode01_33435 (connect_lhs 621): invalid challenge from inode02 32969: {}
> inode01_33435 (enter_ring 566): lhs connect failed
> inode01_33435 (run 233): failed to enter ring
>
> on node02:  (the first mpd )
>
> inode02_32969: mpd_uncaught_except_tb handling:
>   exceptions.TypeError: sequence item 0: expected string, int found
>     /usr/mpi/gcc/mvapich2-0.9.8-15/bin/mpdlib.py  733  handle_ring_listener_connection
>         newsock.correctChallengeResponse = \
>     /usr/mpi/gcc/mvapich2-0.9.8-15/bin/mpdlib.py  488  handle_active_streams
>         handler(stream,*args)
>     /usr/mpi/gcc/mvapich2-0.9.8-15/bin/mpd  266  runmainloop
>         rv = self.streamHandler.handle_active_streams(timeout=8.0)
>     /usr/mpi/gcc/mvapich2-0.9.8-15/bin/mpd  240  run
>         self.runmainloop()
>     /usr/mpi/gcc/mvapich2-0.9.8-15/bin/mpd  1344  ?
>         mpd.run()
>
>
>
> Has anyone encountered this problem?
> Thanks in advance.
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>


