[mvapich-discuss] mvapich2 and openmp
Dominique DELANDE
Dominique.Delande at spectro.jussieu.fr
Thu May 31 04:01:26 EDT 2007
Ranjit Noronha wrote:
> Dominique,
>
> Can you try setting the following flag:
>
> export MV2_ENABLE_AFFINITY=0
>
> thanks,
> --ranjit
>
Ranjit,
Great, it works! The threads are now running in parallel and "top"
reports 400% activity (4 cores).
I looked at the documentation and could not find anything on the
relation between OpenMP and AFFINITY (I may have missed it).
Thanks a lot for your very useful help.
Dominique
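
For anyone hitting the same symptom, here is a minimal Linux-only sketch of how the affinity mask could be inspected from inside the program. It relies on the general Linux behaviour that threads inherit the affinity mask of the thread that creates them, so a process pinned to one core keeps all of its OpenMP threads on that core. The helper names (allowed_cpu_count, first_allowed_cpu, pin_to_cpu, report_affinity) are hypothetical and are not part of MVAPICH2. pin_to_cpu only imitates the effect of a pinned MPI process; it is not taken from the MVAPICH2 sources.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* needed for cpu_set_t and the CPU_* macros */
#endif
#include <sched.h>
#include <stdio.h>

/* Number of CPUs the calling thread may run on, or -1 on error. */
int allowed_cpu_count(void)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;
    return CPU_COUNT(&mask);
}

/* Lowest-numbered CPU in the current affinity mask, or -1 on error. */
int first_allowed_cpu(void)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &mask))
            return cpu;
    return -1;
}

/* Restrict the calling thread to a single CPU (illustration only:
 * this imitates a pinned process). Returns 0 on success. */
int pin_to_cpu(int cpu)
{
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    return sched_setaffinity(0, sizeof(mask), &mask);
}

/* Print how many CPUs are available to threads created from here on. */
void report_affinity(const char *label)
{
    int n = allowed_cpu_count();
    if (n < 0)
        printf("%s: could not read the affinity mask\n", label);
    else if (n == 1)
        printf("%s: pinned to 1 CPU, threads will share one core\n", label);
    else
        printf("%s: %d CPUs allowed\n", label, n);
}
```

Calling report_affinity() right after MPI initialisation, once with the default environment and once with MV2_ENABLE_AFFINITY=0, should show whether the mask is what keeps the threads on one core. This assumes the library has already applied its binding at that point.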
>>> Thanks for sending the code. We built mvapich2-0.9.8p2 with
>>> --enable-threads=multiple. We modified your program a bit to measure the
>>> loop time. I have attached the modified version. The program was
>>> compiled with icc version 9.1 as follows:
>>>
>>> [bash: noronha at i2-2 /tmp/mvapich2-0.9.8p2/osu_benchmarks]$pwd
>>> /tmp/mvapich2-0.9.8p2/osu_benchmarks
>>>
>>> icc -O3 -openmp ./openmp.c -lmpich -lpthread -libverbs -libumad
>>> -I../src/include -L../lib -L /usr/local/ofed/lib64/
>>> ./openmp.c(42) : (col. 1) remark: OpenMP DEFINED LOOP WAS PARALLELIZED.
>>>
>>> The program was run on a dual-socket, quad-core Intel Clovertown system
>>> (8 cores). We got the following results:
>>>
>>> Number of threads = 8 Loop time: 0.43 s
>>> Number of threads = 4 Loop time: 0.84 s
>>> Number of threads = 2 Loop time: 1.68 s
>>> Number of threads = 1 Loop time: 3.37 s
>>>
>>> This seems to indicate that there is no serialization happening. top
>>> shows that the cores are being utilized 100% when the loop is running.
>>> I have attached the complete trace of the output.
>>>
>>> What kind of system are you using? Is it an Intel- or Opteron-based system?
>>>
>>> thanks,
>>> --ranjit
>> Ranjit,
>>
>> I am using dual-core Opteron 285 processors (2.6 GHz), with two
>> processors per motherboard (Tyan Transport GT24 (B2891)).
>>
>> I ran the same code as you did and got the following results:
>> Number of threads = 8 Loop time: 1.92 s
>> Number of threads = 4 Loop time: 1.92 s
>> Number of threads = 2 Loop time: 1.92 s
>> Number of threads = 1 Loop time: 1.92 s
>>
>> And top shows that only 1/4 of the CPUs, that is, only one core, is used.
>>
>> mvapich2-0.9.8p2 was built using the make.mvapich.ofa script with the
>> following variables set to:
>> export CC=icc
>> export FC=ifort
>> export F77=ifort
>> export CXX=icpc
>> export MULTI_THREAD=yes
>>
>> I attach the config.log, config-mine.log, make-mine.log and
>> install-mine.log files, as well as the stderr and stdout of the program.
>>
>> The OS is Fedora Core 5 (x86_64 SMP) and I tried both a Fedora Kernel
>> (2.6.20-1.2316.fc5) and a 2.6.21.3 from kernel.org, with the same
>> result. I use OFED-1.2-rc3, installed using the provided scripts.
>>
>> The program was compiled exactly as you did:
>> icc -O3 -openmp ./openmp.c -lmpich -lpthread -libverbs -libumad
>> -I/usr/local/mvapich2/include -L/usr/local/mvapich2/lib -L
>> /usr/local/ofed/lib64/
>>
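
The two sets of loop times quoted above can be compared by computing speedup (single-thread time divided by N-thread time) and parallel efficiency (speedup divided by thread count). The sketch below uses only the timings posted in this thread; the function and array names are hypothetical. On the Clovertown numbers the speedup at 8 threads comes out at about 7.8, which is close to linear, while on the Opteron numbers it stays at 1.0 for every thread count, which is what serialization onto one core would look like.

```c
#include <stdio.h>

/* Loop times quoted in this thread, for 8, 4, 2 and 1 threads. */
const int    thread_counts[4]   = { 8,    4,    2,    1    };
const double clovertown_secs[4] = { 0.43, 0.84, 1.68, 3.37 };
const double opteron_secs[4]    = { 1.92, 1.92, 1.92, 1.92 };

/* Speedup relative to the single-thread time t1. */
double speedup(double t1, double tn)
{
    return t1 / tn;
}

/* Parallel efficiency: speedup divided by the number of threads. */
double efficiency(double t1, double tn, int threads)
{
    return speedup(t1, tn) / threads;
}

/* Print one line per measurement; t1 is the single-thread time. */
void print_scaling(const char *label, double t1,
                   const int *threads, const double *secs, int n)
{
    printf("%s\n", label);
    for (int i = 0; i < n; i++)
        printf("  %d threads: %.2f s, speedup %.2f, efficiency %.0f%%\n",
               threads[i], secs[i],
               speedup(t1, secs[i]),
               100.0 * efficiency(t1, secs[i], threads[i]));
}
```

A call such as print_scaling("Clovertown", clovertown_secs[3], thread_counts, clovertown_secs, 4) prints the table for one machine. Note that the Opteron machine has only 4 cores, so even without any pinning its 8-thread run could not be expected to exceed a speedup of about 4.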
--
Dominique Delande (Dominique.Delande at spectro.jussieu.fr)
Laboratoire Kastler-Brossel - Case 74 - Universite P. et M. Curie
4, place Jussieu, F-75252 Paris Cedex 05, FRANCE
Phone : 33 (0)1 44 27 27 97 - Fax : 33 (0)1 44 27 38 45
Acces : Pyramide de la Scolarite Paris VI - 1er etage - Bureau 214