[mvapich-discuss] mvapich2 and openmp

Dominique DELANDE Dominique.Delande at spectro.jussieu.fr
Thu May 31 04:01:26 EDT 2007


Ranjit Noronha wrote:
> Dominique,
> 
> Can you try setting the following flag:
> 
> export MV2_ENABLE_AFFINITY=0
> 
> thanks,
> --ranjit
> 

	Ranjit,

Great, it works! The threads are now running in parallel and "top"
reports 400% activity (4 cores).

I looked at the documentation and could not find anything on the 
relation between OpenMP and MV2_ENABLE_AFFINITY (I may have missed it).
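
For anyone hitting the same symptom, one way to check whether the process
is pinned is to print how many CPUs it is allowed to run on. Below is a
minimal, Linux-only sketch. The function names count_allowed_cpus and
report_allowed_cpus are illustrative and not part of MVAPICH2. It rests on
an assumption I have not verified against the MVAPICH2 sources: that with
affinity enabled the MPI process is bound to a single core and the OpenMP
threads inherit that binding, which would explain the constant loop time.

```c
#define _GNU_SOURCE   /* needed for sched_getaffinity() and the CPU_* macros */
#include <assert.h>
#include <sched.h>
#include <stdio.h>

/* Number of CPUs the calling thread may run on, or -1 on error (Linux only). */
int count_allowed_cpus(void)
{
    cpu_set_t mask;

    CPU_ZERO(&mask);
    /* pid 0 means "the calling thread" */
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
        perror("sched_getaffinity");
        return -1;
    }
    return CPU_COUNT(&mask);
}

/* Print the count with a caller-supplied label and return it. */
int report_allowed_cpus(const char *label)
{
    int n;

    assert(label != NULL);
    n = count_allowed_cpus();
    printf("%s: allowed on %d CPU(s)\n", label, n);
    return n;
}
```

Calling report_allowed_cpus("after MPI_Init") right after MPI_Init, once
with the default settings and once with MV2_ENABLE_AFFINITY=0, should show
whether the allowed set changes. If the assumption above holds, a count of
1 in the first case would match the single busy core seen in "top".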

Thanks a lot for your very useful help.

Dominique

>>> Thanks for sending the code. We built mvapich2-0.9.8p2 with 
>>> --enable-threads=multiple. We modified your program a bit to measure the 
>>> loop time. I have attached
>>> the modified version. The program was compiled with icc version 9.1 as 
>>> follows:
>>>
>>> [bash: noronha at i2-2 /tmp/mvapich2-0.9.8p2/osu_benchmarks]$pwd
>>> /tmp/mvapich2-0.9.8p2/osu_benchmarks
>>>
>>> icc -O3   -openmp ./openmp.c -lmpich -lpthread -libverbs -libumad 
>>> -I../src/include -L../lib -L /usr/local/ofed/lib64/
>>> ./openmp.c(42) : (col. 1) remark: OpenMP DEFINED LOOP WAS PARALLELIZED.
>>>
>>> The program was run on a dual-CPU, quad-core Intel Clovertown system
>>> (8 cores). We got the following results:
>>>
>>> Number of threads = 8 Loop time: 0.43 s
>>> Number of threads = 4 Loop time: 0.84 s
>>> Number of threads = 2 Loop time: 1.68 s
>>> Number of threads = 1 Loop time: 3.37 s
>>>
>>> This seems to indicate that there is no serialization happening. top
>>> shows that the cores are being utilized 100% when the loop is running. 
>>> I have attached the complete trace of the output.
>>>
>>> What kind of system are you using? Is it an Intel or Opteron based system?
>>>
>>> thanks,
>>> --ranjit
>> 	Ranjit,
>>
>> I am using dual-core Opteron 285 processors (2.6 GHz), with two 
>> processors per motherboard (Tyan Transport GT24 (B2891)).
>>
>> I ran the same code as you did and got the following results:
>> Number of threads = 8 Loop time: 1.92 s
>> Number of threads = 4 Loop time: 1.92 s
>> Number of threads = 2 Loop time: 1.92 s
>> Number of threads = 1 Loop time: 1.92 s
>>
>> And top shows that only 1/4 of the CPU capacity, that is, a single core, is used.
>>
>> mvapich2-0.9.8p2 was built using the make.mvapich.ofa script with the 
>> following variables set to:
>> export CC=icc
>> export FC=ifort
>> export F77=ifort
>> export CXX=icpc
>> export MULTI_THREAD=yes
>>
>> I attach the config.log, config-mine.log, make-mine.log and 
>> install-mine.log files, as well as the stderr and stdout of the program.
>>
>> The OS is Fedora Core 5 (x86_64 SMP) and I tried both a Fedora Kernel 
>> (2.6.20-1.2316.fc5) and a 2.6.21.3 from kernel.org, with the same 
>> result. I use OFED-1.2-rc3, installed using the provided scripts.
>>
>> The program was compiled exactly as you did it:
>> icc -O3   -openmp ./openmp.c -lmpich -lpthread -libverbs -libumad 
>> -I/usr/local/mvapich2/include -L/usr/local/mvapich2/lib
>> -L /usr/local/ofed/lib64/
>>



-- 
    Dominique Delande (Dominique.Delande at spectro.jussieu.fr)
    Laboratoire Kastler-Brossel - Case 74 - Universite P. et M. Curie
    4, place Jussieu, F-75252 Paris Cedex 05, FRANCE
    Phone : 33 (0)1 44 27 27 97 - Fax : 33 (0)1 44 27 38 45
    Acces : Pyramide de la Scolarite Paris VI - 1er etage - Bureau 214

