[mvapich-discuss] question regarding threaded application
Alan Souza
alan.avbs at rocketmail.com
Wed May 7 14:55:46 EDT 2014
Thanks for your reply Mr. Khaled. I used the following configuration parameters:
./configure --prefix=${HOME} --enable-shared --enable-rdma-cm --enable-threads=runtime --enable-romio --enable-hybrid --with-device=ch3:nemesis:ib,tcp --enable-registration-cache --with-ibverbs=/usr/lib64 --with-ibverbs-include=/usr/include/infiniband --with-ib-libpath=/usr/lib64 --with-ib-include=/usr/include/infiniband CC=$(which gcc) CXX=$(which g++) FC=$(which gfortran) F77=$(which gfortran)
I use a custom installation of gcc (version 4.8.2).
With MPICH, each instance uses approximately 65% of each node's memory, but with the MVAPICH library this value increases to as much as 75%.
It's difficult to create an example, but this is the general structure of the application:
Call a single-threaded function (in this phase the extra thread is active)
Loop (this loop is parallelized using MPI)
    Call a single-threaded function that calls a multithreaded function (OpenBLAS)
    Call an explicitly threaded function (using OpenMP)
Inside the loop the extra thread stays inactive
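
To put numbers on the extra thread and on the memory difference described above, something like the following could be run on a compute node while the job is executing. It is only a sketch that assumes a Linux system with the /proc filesystem; the function name inspect_threads is made up for illustration, and the demo call inspects the shell itself rather than the real application. Because threads share one address space, the VmRSS value is for the whole process, not for a single thread.

```shell
#!/bin/sh
# Sketch (assumes Linux /proc): report thread count, resident memory and
# per-thread state for one process. "inspect_threads" is a hypothetical name.
inspect_threads() {
    pid="$1"
    # Each entry under /proc/PID/task is one thread of the process.
    nthreads=$(ls "/proc/$pid/task" | wc -l | tr -d ' ')
    # VmRSS is the resident set size of the whole process (shared by all threads).
    rss=$(awk '/^VmRSS:/ {print $2, $3}' "/proc/$pid/status")
    echo "pid=$pid threads=$nthreads rss=$rss"
    # One line per thread: thread id, name and scheduler state (R, S, D, ...).
    for t in /proc/"$pid"/task/*; do
        name=$(awk '/^Name:/ {print $2}' "$t/status")
        state=$(awk '/^State:/ {print $2}' "$t/status")
        echo "  tid=${t##*/} name=$name state=$state"
    done
}

# Demo: inspect the current shell; replace "$$" with the PID of ./application.
inspect_threads "$$"
```

Running it once during a serial phase and once inside the loop would show which thread changes between the running (R) and sleeping (S) states.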
On Wednesday, 7 May 2014, 10:36, khaled hamidouche <hamidouc at cse.ohio-state.edu> wrote:
Hi Alan,
Thanks for your note, we will take a look at this. In order to debug this locally, can you please share with us:
1) your configure options
2) the amount of memory the mystery thread is taking in this setup
3) your application (a reproducer)
Thanks
On Tue, May 6, 2014 at 11:38 PM, Alan Souza <alan.avbs at rocketmail.com> wrote:
I have an application that makes heavy use of some multithreaded
libraries (OpenBLAS, compiled to use OpenMP and with no
thread binding) and UMFPACK. All communication is done from
the main thread (I have already tested some variations of
MPI_Init_thread), but some problems occur when I use MVAPICH. On
the cluster I am using, each node has 8 cores and I am running only one
MPI process per node.
>
>But when I observed the process on some nodes, I noticed that 9 threads
>had been created. One of them is kept active only in some serial parts,
>while the others are inactive. In the explicitly threaded (OpenMP) parts
>of the application, eight threads are active and this extra one stays idle.
>The main problem for me is the increase in memory consumption, because
>some allocations are done before the threaded parts. This behavior does
>not occur with either MPICH or OpenMPI.
>
>I use the following command to launch the jobs when using mvapich 2.0rc1:
>
>mpirun_rsh -np ${N} -hostfile host.txt MV2_ENABLE_AFFINITY=0 MV2_IBA_HCA=mlx4_0 MV2_NUM_HCAS=1 OMP_NUM_THREADS=8 ./application parameters
>
>The hostfile contains the name of each node used, with the extended syntax :1 to indicate the number of processes per node.
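
For illustration, a hostfile of the kind described could be generated as in the sketch below. The node names node01 to node03 are placeholders, not the actual cluster hosts, and the helper name make_hostfile is hypothetical; only the `name:1` form comes from the description above.

```shell
#!/bin/sh
# Sketch: write an mpirun_rsh hostfile with one MPI process per node.
# "make_hostfile" and the node names are illustrative assumptions.
make_hostfile() {
    out="$1"; shift
    : > "$out"                              # create or truncate the file
    for node in "$@"; do
        printf '%s:1\n' "$node" >> "$out"   # ":1" = one process on this node
    done
}

make_hostfile host.txt node01 node02 node03
cat host.txt
```

With three nodes listed this way, the job would then be launched with -np 3 so that each node receives exactly one process.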
>
>
>thanks