[mvapich-discuss] MVAPICH-0.9.8 lockup with OFED-1.1-rc3

Sayantan Sur surs at cse.ohio-state.edu
Wed Sep 13 11:13:53 EDT 2006


Hello Andrew,

Andrew Dobbie wrote:

>The hosts file is identical for all machines except for the 127.0.0.1
>entry.  Will this cause a problem?  Also, is it wise to specify IPoIB
>addresses of machines instead of ethernet?  I am confused as to why I
>only have problems with mvapich compiled with _SMP_ and not without.
>  
>
Just for clarification, are you still on the 32-bit OS or on the 64-bit 
one? IPoIB shouldn't make a difference. Unless you are launching a very 
large number of processes, it is unlikely to matter which one you use 
(IP over Ethernet or IP over IB), since the control traffic in 
mpirun_rsh will be small.

As Pasha indicates, the problem could be with the hosts files. MVAPICH 
(the code under _SMP_) figures out which processes are on the same node 
by comparing their hostnames.
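
To illustrate why the names matter, here is a rough sketch of 
hostname-based grouping. It is not the actual MVAPICH code; the function 
names and the list-of-hostnames input are assumptions made up for this 
example. The point is only that the comparison is on strings, so ranks 
count as co-located exactly when the names they report are identical.

```python
from collections import defaultdict


def group_ranks_by_hostname(hostnames):
    """Map each reported hostname to the list of ranks that reported it.

    `hostnames[i]` is the name rank i reports (hypothetical input format).
    """
    groups = defaultdict(list)
    for rank, name in enumerate(hostnames):
        groups[name].append(rank)
    return dict(groups)


def local_peers(hostnames, rank):
    """Ranks that appear to share a node with `rank`, by string equality only."""
    mine = hostnames[rank]
    return [r for r, name in enumerate(hostnames) if r != rank and name == mine]


if __name__ == "__main__":
    # Two processes on ND01 and one on ND02, all resolving consistently.
    print(group_ranks_by_hostname(["ND01", "ND01", "ND02"]))
    # If processes on different machines both reported "localhost", a plain
    # string comparison would treat them as being on the same node.
    print(local_peers(["localhost", "localhost"], 0))
```

With a scheme like this, anything that makes a machine report a 
different name than the other processes expect, or makes two machines 
report the same name, changes which processes are considered local.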

>127.0.0.1       ND01    localhost.localdomain   localhost
>  
>
What happens if all the /etc/hosts files say:

127.0.0.1      localhost.localdomain localhost

(instead of having ND01 in the line)
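
If it helps, a small script along these lines could be run on each node 
to spot loopback entries that carry a node name. This is only a sketch: 
the function name is made up, and it assumes the usual /etc/hosts layout 
of an address followed by whitespace-separated names, with "#" starting 
a comment.

```python
def loopback_extra_names(hosts_text):
    """Return names mapped to 127.0.0.1 other than the usual localhost aliases."""
    expected = {"localhost", "localhost.localdomain"}
    extras = []
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] == "127.0.0.1":
            extras.extend(name for name in fields[1:] if name not in expected)
    return extras


if __name__ == "__main__":
    with open("/etc/hosts") as f:
        print(loopback_extra_names(f.read()))
```

An empty list means the 127.0.0.1 line only names localhost, as in the 
line suggested above.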

Thanks,
Sayantan.

>192.168.1.80    FLSRVR
>192.168.8.1     ND01
>192.168.8.2     ND02
>192.168.8.3     ND03
>192.168.8.4     ND04
>192.168.8.5     ND05
>192.168.8.6     ND06
>  
>

>On Wed, 2006-09-13 at 15:19 +0300, Pavel Shamis (Pasha) wrote:
>  
>
>>Andrew Dobbie wrote:
>>    
>>
>>>Yes.  The application I use runs mpirun_rsh and has the same problem as
>>>the benchmarks.  The hostnames I use point to IPoIB addresses of the
>>>machines and all hosts have the same entries in /etc/hosts.  Specifying
>>>which hosts to use on command line or from -hostfile doesn't seem to
>>>matter.
>>>
>>>Does that answer the question you were asking?
>>>
>>>      
>>>
>>Can you please provide your /etc/hosts file? The file
>>should be exactly the same on all machines, please check it.
>>
>>    
>>
>
>
>_______________________________________________
>mvapich-discuss mailing list
>mvapich-discuss at mail.cse.ohio-state.edu
>http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>  
>


-- 
http://www.cse.ohio-state.edu/~surs


