[mvapich-discuss] can't set up mpd ring between two nodes (fwd)
jetspeed
ibatis2 at 163.com
Thu Dec 27 22:33:18 EST 2007
It works! The MPD ring is now set up correctly.
Thanks a lot.
I will check MPI communication over InfiniBand next. Thanks for the help I got from this mailing list.
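
For anyone who finds this thread later: the TypeError in the traceback quoted below ("sequence item 0: expected string, int found") is consistent with an all-digit secret word being converted to an integer before it is joined with other strings. The following is only a minimal sketch of that suspected mechanism, not the actual mpdlib.py code; parse_value, challenge_input, and try_secret are hypothetical names, and the digit-to-int conversion is an assumption.

```python
def parse_value(raw):
    """Hypothetical config parser: all-digit values become int (assumed behaviour)."""
    return int(raw) if raw.isdigit() else raw


def challenge_input(secretword, rand_val):
    """Hypothetical helper: concatenate the secret word with a random challenge string."""
    # str.join raises TypeError if any item in the list is not a string.
    return "".join([secretword, rand_val])


def try_secret(raw):
    """Return 'ok' if the secret word survives the join, 'TypeError' otherwise."""
    try:
        challenge_input(parse_value(raw), "12345")
        return "ok"
    except TypeError:
        return "TypeError"


if __name__ == "__main__":
    for word in ("1111", "c1111"):
        print(word, "->", try_secret(word))
```

Under these assumptions, '1111' fails in the join while 'c1111' passes, which matches what I saw after changing the secret word.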
On Thu, 27 Dec 2007 11:01:56 -0500 (EST)
Matthew Koop <koop at cse.ohio-state.edu> wrote:
>
> Can you try changing your secret word in .mpd.conf to start with a letter
> instead of a digit? (i.e. 'c1111' instead of '1111').
>
> Thanks,
>
> Matt
>
> On Thu, 27 Dec 2007, jetspeed wrote:
>
> > Hi, huanwei:
> > Thanks for your reply.
> > 1. There is no other mpd running on the same node; I run mpdallexit before I start mpd.
> > 2. There is a .mpd.conf in the home directory, and the two nodes share $HOME over NFS. The content of .mpd.conf is: secretword=1111
> >
> > I want to use InfiniBand, so I installed OFED 1.2.5. I am not sure InfiniBand is configured correctly, but the mpdcheck program works correctly between the two nodes.
> >
> > I will check the README_MPICH2 you mentioned.
> > Thanks again,
> >
> >
> > On Wed, 26 Dec 2007 11:22:19 -0500 (EST)
> > wei huang <huanwei at cse.ohio-state.edu> wrote:
> >
> > > Hi,
> > >
> > > Thanks for using mvapich2.
> > >
> > > > I installed mvapich2, which comes with OFED 1.2.5.
> > >
> > > So are you using the default installation that comes with the OFED package?
> > >
> > > > 1. When I run mpdboot on a machine, I get:
> > > > mpdboot_inode02 (handle_mpd_output 359): failed to ping mpd on inode02; recvd output={}
> > >
> > > Several things can cause this failure, but there are a few
> > > things to check first:
> > >
> > > 1) Do you have another mpd running on the same set of nodes (under the same
> > > user name)?
> > >
> > > 2) Do you have .mpd.conf in your home directory?
> > >
> > > I also want to mention that we have already released mvapich2-1.0.1. You
> > > can try that by downloading the software package from our website:
> > >
> > > http://mvapich.cse.ohio-state.edu/
> > >
> > > There is a file called README_MPICH2 in the package. You can also read
> > > that for more details on setting up mpd rings.
> > >
> > > Please let us know if this works.
> > >
> > > Thanks.
> > >
> > > -- Wei
> > >
> > > > 2. When I try to set up the mpd ring by hand, following the MPICH2 user guide:
> > > > mpd & (on node02)
> > > > mpd -h node02 -p port (on node01)
> > > > I get:
> > > > on node01 (the second mpd):
> > > > inode01_33435 (connect_lhs 621): invalid challenge from inode02 32969: {}
> > > > inode01_33435 (enter_ring 566): lhs connect failed
> > > > inode01_33435 (run 233): failed to enter ring
> > > >
> > > > on node02 (the first mpd):
> > > >
> > > > inode02_32969: mpd_uncaught_except_tb handling:
> > > > exceptions.TypeError: sequence item 0: expected string, int found
> > > > /usr/mpi/gcc/mvapich2-0.9.8-15/bin/mpdlib.py 733 handle_ring_listener_connection
> > > > newsock.correctChallengeResponse = \
> > > > /usr/mpi/gcc/mvapich2-0.9.8-15/bin/mpdlib.py 488 handle_active_streams handler(stream,*args)
> > > > /usr/mpi/gcc/mvapich2-0.9.8-15/bin/mpd 266 runmainloop
> > > > rv = self.streamHandler.handle_active_streams(timeout=8.0)
> > > > /usr/mpi/gcc/mvapich2-0.9.8-15/bin/mpd 240 run
> > > > self.runmainloop()
> > > > /usr/mpi/gcc/mvapich2-0.9.8-15/bin/mpd 1344 ?
> > > > mpd.run()
> > > >
> > > >
> > > >
> > > > Has anyone encountered this problem?
> > > > Thanks in advance.
> > > >
> > > >
> > > > _______________________________________________
> > > > mvapich-discuss mailing list
> > > > mvapich-discuss at cse.ohio-state.edu
> > > > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> > > >
> > >
> >
> >
>
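
As a follow-up to the fix suggested in this thread, the secret word can be checked before starting mpd. The sketch below is illustrative only: find_secretword_problems is a hypothetical helper, it looks only at the secretword= key shown in this thread, and treating '#' as a comment marker is an assumption about the file format.

```python
def find_secretword_problems(conf_text):
    """Return a list of problems found with the secretword entry in .mpd.conf text."""
    problems = []
    secret = None
    for line in conf_text.splitlines():
        # Assumption: anything after '#' is a comment.
        line = line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "secretword":
            secret = value
    if secret is None:
        problems.append("no secretword entry found")
    elif not secret:
        problems.append("secretword is empty")
    elif not secret[0].isalpha():
        # The advice in this thread: start the secret word with a letter.
        problems.append("secretword does not start with a letter")
    return problems


if __name__ == "__main__":
    for text in ("secretword=1111", "secretword=c1111"):
        print(repr(text), "->", find_secretword_problems(text))
```

To use it on a real file, read the contents of ~/.mpd.conf into a string and pass it to the function; an empty list means no problem was detected by this check.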