[mvapich-discuss] question regarding threaded application

Alan Souza alan.avbs at rocketmail.com
Wed May 7 14:55:46 EDT 2014


Thanks for your reply, Mr. Khaled. I used the following configuration parameters:

./configure --prefix=${HOME} --enable-shared --enable-rdma-cm \
    --enable-threads=runtime --enable-romio --enable-hybrid \
    --with-device=ch3:nemesis:ib,tcp --enable-registration-cache \
    --with-ibverbs=/usr/lib64 --with-ibverbs-include=/usr/include/infiniband \
    --with-ib-libpath=/usr/lib64 --with-ib-include=/usr/include/infiniband \
    CC=$(which gcc) CXX=$(which g++) FC=$(which gfortran) F77=$(which gfortran)

I use a custom installation of gcc (version 4.8.2).

Using MPICH, each instance uses approximately 65% of each node's memory, but when I use the MVAPICH library this value increases to about 75%.

It's difficult to create an example, but this is the general structure of the application:

Call a single-threaded function (in this phase the extra thread is active)

Loop (this loop is parallelized using MPI)

    Call a single-threaded function that calls a multithreaded function (OpenBLAS)

    Call an explicitly threaded function (using OpenMP)

Inside the loop the extra thread stays inactive.
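
A minimal sketch of that structure (the function names and loop bounds are placeholders, not the real application; it assumes MPI_THREAD_FUNNELED, since all communication happens on the main thread):

    /* Sketch only; build with, e.g.: mpicc -fopenmp sketch.c */
    #include <mpi.h>

    static void serial_setup(void) { /* single-threaded phase; the extra
                                        thread stays active here */ }

    static void blas_step(void) { /* wrapper that calls into multithreaded
                                     OpenBLAS, e.g. a dgemm */ }

    int main(int argc, char **argv)
    {
        int provided, rank, size, i;

        /* Only the main thread makes MPI calls, so FUNNELED suffices */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        serial_setup();

        /* Loop parallelized across MPI ranks (one rank per 8-core node) */
        for (i = rank; i < 1000; i += size) {
            blas_step();             /* OpenBLAS spins up its own threads */

            #pragma omp parallel     /* explicit OpenMP part: 8 threads */
            {
                /* ... per-thread work ... */
            }
        }

        MPI_Finalize();
        return 0;
    }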

On Wednesday, 7 May 2014, 10:36, khaled hamidouche <hamidouc at cse.ohio-state.edu> wrote:
 
Hi Alan, 

Thanks for your note; we will take a look at this. In order to debug this locally, can you please share with us:

1)  your configure options
2)  the amount of memory the mystery thread is taking in your setup
3)  your application (a reproducer) 

Thanks

On Tue, May 6, 2014 at 11:38 PM, Alan Souza <alan.avbs at rocketmail.com> wrote:

I have an application that makes heavy use of some multithreaded libraries: OpenBLAS (compiled to use OpenMP, with no thread binding) and UMFPACK. All communication is performed from the main thread (I have already tested some variations of MPI_Init_thread), but some problems occur when I use MVAPICH. On the cluster that I am using, each node has 8 cores, and I am running only one MPI process per node.
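
For reference, here is a minimal sketch (not taken from the actual application) of one such variation: requesting FUNNELED support, since only the main thread communicates, and checking what level the library actually grants:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided;

        /* Ask for FUNNELED: threads exist, but only the main
         * thread makes MPI calls */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
        if (provided < MPI_THREAD_FUNNELED)
            fprintf(stderr, "granted thread level %d only\n", provided);

        /* ... application work ... */

        MPI_Finalize();
        return 0;
    }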
But when I was observing the process on some nodes, I noticed that 9 threads had been created. One of them is kept active only in some serial parts while the others are inactive; in the explicitly threaded (OpenMP) parts of the application, eight threads are active and this extra one stays idle. The main problem with this, for me, is the increased memory consumption, because some allocations are done before the threaded parts. This behavior occurs with neither MPICH nor OpenMPI.

I use the following command to launch the jobs when using MVAPICH2 2.0rc1:

mpirun_rsh -np ${N} -hostfile host.txt MV2_ENABLE_AFFINITY=0 MV2_IBA_HCA=mlx4_0 MV2_NUM_HCAS=1 OMP_NUM_THREADS=8 ./application parameters

The hostfile contains the name of each node, using the extended syntax :1 to indicate the number of processes per node.
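
For example, with hypothetical node names, host.txt would look like:

    node01:1
    node02:1
    node03:1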

Thanks