[mvapich-discuss] MVAPICH-0.9.8 lockup with OFED-1.1-rc3
Sayantan Sur
surs at cse.ohio-state.edu
Wed Sep 13 11:13:53 EDT 2006
Hello Andrew,
Andrew Dobbie wrote:
>The hosts file is identical for all machines except for the 127.0.0.1
>entry. Will this cause a problem? Also, is it wise to specify IPoIB
>addresses of machines instead of ethernet? I am confused as to why I
>only have problems with mvapich compiled with _SMP_ and not without.
>
>
Just for clarification, are you still on the 32-bit OS or on the 64-bit
one? IPoIB shouldn't make a difference. Unless you are launching a very
large number of processes, it is unlikely to matter which one you use
(IP over Ethernet or IP over IB), since the control traffic in
mpirun_rsh will be small.
As Pasha indicates, the problem could be with the /etc/hosts files.
MVAPICH (the code under _SMP_) figures out which processes are on the
same node by comparing the hostnames.
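To illustrate why hostname comparison is sensitive to name resolution, here is a minimal sketch of the idea, not MVAPICH's actual code. The function name `group_ranks_by_hostname` and the sample hostname lists are hypothetical; the sketch only assumes that each process reports one hostname and that processes reporting the same string are treated as sharing a node.

```python
from collections import defaultdict


def group_ranks_by_hostname(hostnames):
    """Map each reported hostname to the list of ranks that reported it.

    hostnames[i] is the name reported by rank i (hypothetical input).
    """
    groups = defaultdict(list)
    for rank, name in enumerate(hostnames):
        groups[name].append(rank)
    return dict(groups)


# Hypothetical run: two processes on each of two nodes, consistent names.
consistent = ["ND01", "ND01", "ND02", "ND02"]
print(group_ranks_by_hostname(consistent))

# Hypothetical run: two processes on the same node report different names,
# so a pure string comparison would not recognize them as co-located.
inconsistent = ["ND01", "localhost.localdomain"]
print(group_ranks_by_hostname(inconsistent))
```

Under this assumption, any mismatch between the name a process reports and the name its peers expect changes which ranks are considered to be on the same node.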
>127.0.0.1 ND01 localhost.localdomain localhost
>
>
What happens if all the /etc/hosts files say:
127.0.0.1 localhost.localdomain localhost
(instead of having ND01 in the line)
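A quick way to check each node for this pattern is sketched below. It is only an illustration: the helper name `extra_loopback_names` is hypothetical, and it assumes the usual /etc/hosts layout of an address followed by one or more names, with `#` starting a comment.

```python
# Names that are expected on a loopback line.
LOOPBACK_NAMES = {"localhost", "localhost.localdomain"}


def extra_loopback_names(hosts_text):
    """Return names mapped to a 127.x.x.x address other than the localhost names."""
    extras = []
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        if address.startswith("127."):
            extras.extend(n for n in names if n not in LOOPBACK_NAMES)
    return extras


if __name__ == "__main__":
    # Reads the local file; run this on every node and compare the results.
    with open("/etc/hosts") as f:
        print(extra_loopback_names(f.read()))
```

An empty list means the loopback line carries only the localhost names, as in the line suggested above.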
Thanks,
Sayantan.
>192.168.1.80 FLSRVR
>192.168.8.1 ND01
>192.168.8.2 ND02
>192.168.8.3 ND03
>192.168.8.4 ND04
>192.168.8.5 ND05
>192.168.8.6 ND06
>
>
>On Wed, 2006-09-13 at 15:19 +0300, Pavel Shamis (Pasha) wrote:
>
>
>>Andrew Dobbie wrote:
>>
>>
>>>Yes. The application I use runs mpirun_rsh and has the same problem as
>>>the benchmarks. The hostnames I use point to IPoIB addresses of the
>>>machines and all hosts have the same entries in /etc/hosts. Specifying
>>>which hosts to use on command line or from -hostfile doesn't seem to
>>>matter.
>>>
>>>Does that answer the question you were asking?
>>>
>>>
>>>
>>Can you please provide your /etc/hosts file? The file
>>should be exactly the same on all machines, please check it.
>>
>>
>>
>
>
>_______________________________________________
>mvapich-discuss mailing list
>mvapich-discuss at mail.cse.ohio-state.edu
>http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
>
--
http://www.cse.ohio-state.edu/~surs