[mvapich-discuss] can't set up mpd ring between two nodes (fwd)
Matthew Koop
koop at cse.ohio-state.edu
Thu Dec 27 11:01:56 EST 2007
Can you try changing the secret word in .mpd.conf so that it starts with a
letter instead of a digit (e.g. 'c1111' instead of '1111')?
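
The "expected string, int found" TypeError in the traceback below is consistent with a digits-only secret word being handled as an integer rather than a string, although that is an inference from the message and not something confirmed against mpdlib.py. As a rough sketch, a small standalone check like the following could flag the problem before starting mpd. The function name secretword_problems is made up for illustration and is not part of mpd; it only assumes the simple key=value layout shown in this thread.

```python
def secretword_problems(conf_text):
    """Return a list of problems found with the secretword entry.

    conf_text is the full text of an mpd configuration file.
    Hypothetical helper for illustration; not part of mpd.
    """
    secret = None
    for raw in conf_text.splitlines():
        # Split each line at the first '=' into key and value.
        key, sep, value = raw.partition("=")
        if sep and key.strip() == "secretword":
            secret = value.strip()

    if secret is None:
        return ["no secretword=... line found"]

    problems = []
    if not secret:
        problems.append("secretword is empty")
    elif secret[0].isdigit():
        # This is the case suggested above: '1111' rather than 'c1111'.
        problems.append(
            "secretword %r starts with a digit; try a leading letter" % secret
        )
    return problems


# The value from this thread, and the suggested replacement.
print(secretword_problems("secretword=1111"))
print(secretword_problems("secretword=c1111"))
```

To check a real file, read the contents of ~/.mpd.conf and pass the text to the function; an empty list means this particular check found nothing to complain about.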
Thanks,
Matt
On Thu, 27 Dec 2007, jetspeed wrote:
> Hi, huanwei:
> Thanks for your reply.
> 1. No other mpd is running on the same node; I run mpdallexit before I start mpd.
> 2. There is a .mpd.conf in the home directory, and the two nodes share $HOME over NFS. The content of .mpd.conf is secretword=1111
>
> I want to use InfiniBand, so I installed OFED 1.2.5. I am not sure InfiniBand is configured correctly, but the mpdcheck program works correctly between the two nodes.
>
> I will check the README_MPICH2 you mentioned.
> Thanks again,
>
>
> On Wed, 26 Dec 2007 11:22:19 -0500 (EST)
> wei huang <huanwei at cse.ohio-state.edu> wrote:
>
> > Hi,
> >
> > Thanks for using mvapich2.
> >
> > > I installed mvapich2, which came with OFED 1.2.5.
> >
> > So are you using the default installation that comes with the OFED package?
> >
> > > 1. When I use mpdboot on a machine, I get:
> > > mpdboot_inode02 (handle_mpd_output 359): failed to ping mpd on inode02; recvd output={}
> >
> > There are multiple reasons that can cause this failure, but there are a few
> > things to check first:
> >
> > 1) Do you have another mpd running on the same set of nodes (under the same
> > user name)?
> >
> > 2) Do you have .mpd.conf in your home directory?
> >
> > I also want to mention that we have already released mvapich2-1.0.1. You
> > can try that by downloading the software package from our website:
> >
> > http://mvapich.cse.ohio-state.edu/
> >
> > There is a file called README_MPICH2 in the package. You can also read
> > that for more details on setting up mpd rings.
> >
> > Please let us know if this works.
> >
> > Thanks.
> >
> > -- Wei
> >
> > > 2. When I try to set up the mpd ring by hand, following the mpich2 user guide:
> > > mpd & on node02
> > > mpd -h node02 -p port on node01
> > > I get:
> > > on node01 (the second mpd):
> > > inode01_33435 (connect_lhs 621): invalid challenge from inode02 32969: {}
> > > inode01_33435 (enter_ring 566): lhs connect failed
> > > inode01_33435 (run 233): failed to enter ring
> > >
> > > on node02 (the first mpd):
> > >
> > > inode02_32969: mpd_uncaught_except_tb handling:
> > > exceptions.TypeError: sequence item 0: expected string, int found
> > > /usr/mpi/gcc/mvapich2-0.9.8-15/bin/mpdlib.py 733 handle_ring_listener_connection
> > > newsock.correctChallengeResponse = \
> > > /usr/mpi/gcc/mvapich2-0.9.8-15/bin/mpdlib.py 488 handle_active_streams handler(stream,*args)
> > > /usr/mpi/gcc/mvapich2-0.9.8-15/bin/mpd 266 runmainloop
> > > rv = self.streamHandler.handle_active_streams(timeout=8.0)
> > > /usr/mpi/gcc/mvapich2-0.9.8-15/bin/mpd 240 run
> > > self.runmainloop()
> > > /usr/mpi/gcc/mvapich2-0.9.8-15/bin/mpd 1344 ?
> > > mpd.run()
> > >
> > >
> > >
> > > Has anyone encountered this problem?
> > > Thanks in advance.
> > >
> > >
> > > _______________________________________________
> > > mvapich-discuss mailing list
> > > mvapich-discuss at cse.ohio-state.edu
> > > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> > >
> >
>