[mvapich-discuss] mvapich can't run cross nodes when defining _SMP_

Abhinav Vishnu vishnu at cse.ohio-state.edu
Fri Jul 20 13:07:04 EDT 2007


Hi Terry,
> Dear Abhinav,
>
> In fact, I have tried versions 0.9.7, 0.9.8, and 0.9.9 (release, branches,
> and trunk), with both the "make.mvapich.gen2" and
> "make.mvapich.gen2_multirail" scripts.
> I still face the same problem: the program cannot run across nodes.
> When I undefine those two SMP options, everything works fine.
> I don't know what is happening.
>

This is quite strange. All these MVAPICH versions have gone through
rigorous testing with a variety of MPI benchmarks and we did not see 
this problem.
>
> By the way, my main hardware configuration is
> Intel Xeon E5335 x 2 x 2 nodes
> Fully-Buffered DDR2 667 2GB x 8 x 2 nodes.
> Could you give me more suggestions?
>
Thanks for providing information about your platform. Could you also
provide the following:

1. The InfiniBand card you are using (Mellanox/Pathscale/?...) and the
firmware version on the card. Please also let us know the switch version.

2. Any relevant messages from /var/log/messages and dmesg.
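
The two items above can be gathered on each node with something like the
following. This is only a sketch: it assumes the OFED ibv_devinfo utility is
installed, and the grep patterns (ib_, mthca, ipath, infiniband) are guesses
at what the relevant kernel messages contain, so adjust them for your
driver. The function name collect_ib_report is made up for illustration.
The switch version is not visible from the host and has to be read from the
switch itself.

```shell
#!/bin/sh
# Sketch: collect HCA/firmware details and InfiniBand-related log lines.
# collect_ib_report is a hypothetical helper name, not an OFED or MVAPICH tool.

collect_ib_report() {
    echo "== host: $(uname -n) =="

    echo "== HCA and firmware (ibv_devinfo) =="
    if command -v ibv_devinfo >/dev/null 2>&1; then
        # hca_id names the adapter; fw_ver is the firmware version.
        ibv_devinfo | grep -E 'hca_id|fw_ver|state' || echo "no devices reported"
    else
        echo "ibv_devinfo not found"
    fi

    echo "== kernel messages (dmesg) =="
    if command -v dmesg >/dev/null 2>&1; then
        # Patterns are assumptions; widen them if nothing shows up.
        dmesg 2>/dev/null | grep -i -E 'ib_|mthca|ipath|infiniband' | tail -n 20
    else
        echo "dmesg not found"
    fi

    echo "== /var/log/messages =="
    if [ -r /var/log/messages ]; then
        grep -i -E 'ib_|mthca|ipath|infiniband' /var/log/messages | tail -n 20
    else
        echo "/var/log/messages not readable (try as root)"
    fi
    return 0
}

collect_ib_report
```

Running this on both n1 and n2 and comparing the output should show whether
the two nodes differ in adapter, firmware, or port state.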

Thanks much,

:- Abhinav

> Terry
>
> ----- Original Message -----
> From: "Abhinav Vishnu" <vishnu at cse.ohio-state.edu>
> To: "Terry" <terry.ccchang at gmail.com>
> Cc: <mvapich-discuss at cse.ohio-state.edu>
> Sent: Friday, July 20, 2007 8:52 PM
> Subject: Re: [mvapich-discuss] mvapich can't run cross nodes when defining _SMP_
>
>
>> Hi Terry,
>>
>> Thanks for trying MVAPICH and reporting the problem.
>> We have tried this combination with MVAPICH 0.9.9 using the Intel
>> MPI Benchmarks and did not see this problem. Can you let us know
>> which MPI benchmark you are using to test MVAPICH?
>>
>> Some changes have been made to the multi-rail script since MVAPICH
>> 0.9.9, and these have been checked into the trunk and the later
>> version (0.9.9+psm). Can you try this version and let us know if you
>> still face the same problem?
>>
>> Thanks,
>>
>> :- Abhinav
>>
>>> My environment:
>>> CentOS 4.5
>>> Intel Compiler 10.0.023
>>> OFED 1.2
>>> MVAPICH 0.9.9
>>> I built mvapich using "make.mvapich.gen2_multirail"
>>> after modifying the following lines:
>>> IBHOME=${IBHOME:-/usr/local/ofed}
>>> IBHOME_LIB=${IBHOME_LIB:-/usr/local/ofed/lib64}
>>> PREFIX=${PREFIX:-/usr/local/mvapich-0.9.9}
>>> export CC=${CC:-icc}
>>> export CXX=${CXX:-icpc}
>>> export F77=${F77:-ifort}
>>> If the CFLAGS contain "-D_SMP_ -D_SMP_RNDV_",
>>> mvapich can't run across nodes (e.g., n1 n2).
>>> On n1, the mvapich process is always running,
>>> but on n2 it is always in the sleep state.
>>> However, it runs successfully on a single node (e.g., n1 n1 or n2 n2).
>>> If I remove "-D_SMP_ -D_SMP_RNDV_",
>>> mvapich runs across all nodes and executes successfully.
>>> Can anyone tell me why I can't use the SMP options?
>>> Please help me solve this problem. Thanks.
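
For readers reproducing the report, the settings quoted above rely on the
shell's ${VAR:-default} expansion, which keeps a value already set in the
environment and otherwise falls back to the default. The sketch below shows
that behaviour and prints the two CFLAGS variants being compared. The helper
smp_cflags and the base flag "-O2" are placeholders for illustration only;
they are not taken from the MVAPICH build script.

```shell
#!/bin/sh
# Sketch: the two build variants discussed in this thread.
# smp_cflags is a hypothetical helper, not part of MVAPICH.

smp_cflags() {
    base=$1      # flags common to both builds (placeholder value below)
    use_smp=$2   # "yes" adds the two SMP defines
    if [ "$use_smp" = "yes" ]; then
        echo "$base -D_SMP_ -D_SMP_RNDV_"
    else
        echo "$base"
    fi
}

# ${VAR:-default}: use the environment's value if set and non-empty,
# otherwise the default after ":-". Values are the ones quoted above.
IBHOME=${IBHOME:-/usr/local/ofed}
IBHOME_LIB=${IBHOME_LIB:-/usr/local/ofed/lib64}
PREFIX=${PREFIX:-/usr/local/mvapich-0.9.9}

echo "IBHOME=$IBHOME IBHOME_LIB=$IBHOME_LIB PREFIX=$PREFIX"
echo "with SMP:    $(smp_cflags "-O2" yes)"
echo "without SMP: $(smp_cflags "-O2" no)"
```

Because of the ":-" form, the same paths can also be supplied from the
environment (for example, by exporting PREFIX before running the script)
instead of editing the script.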
>>>  ------------------------------------------------------------------------ 
>>>
>>>
>>> _______________________________________________
>>> mvapich-discuss mailing list
>>> mvapich-discuss at cse.ohio-state.edu
>>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>>
>>