[mvapich-discuss] mvapich2 and openmp

Dominique DELANDE Dominique.Delande at spectro.jussieu.fr
Tue May 29 09:32:50 EDT 2007


Ranjit Noronha wrote:
> Hi Dominique,
> 
>> I am trying to use mvapich2-0.9.8p2, built with 
>> --enable-threads=multiple, with a program where the various threads
>> are created by OpenMP. When running the program, the threads
>> are correctly created and executed, but always SEQUENTIALLY
>> (each node has two dual-core processors, and could thus accommodate
>> 4 threads). When threads are created manually (not with OpenMP),
>> everything works fine. Any idea of what could be wrong?
>> Each node runs a Fedora Core 5 distribution with a SMP 2.6.21.3
>> linux kernel (similar behaviour observed for various kernels).
>> I tried with gcc, version 4.2, and the Intel C compiler, version 9.1,
>> with similar results.
>>
> 
> Can you let us know the application you are using? If possible can we get 
> a copy of this application. 
> 

The application is very simple and short; it is only meant for diagnosing
what the problem might be.
Here it is:


---------------------------------------
#include "mpi.h"
#include <stdio.h>
#include <math.h>
#include <omp.h>

int main( int argc, char *argv[])
{
   int numprocs,myid;
   int  namelen,provided;
   int n,nit,mythread;
   int i,it,j,k;
   double x,y;
   char processor_name[MPI_MAX_PROCESSOR_NAME];
   FILE *pipo;
   MPI_Init_thread(&argc,&argv,MPI_THREAD_MULTIPLE,&provided);
   fprintf(stderr,"%d %d %d %d %d\n",MPI_THREAD_SINGLE,MPI_THREAD_FUNNELED,
           MPI_THREAD_SERIALIZED,MPI_THREAD_MULTIPLE,provided);
   MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
   MPI_Comm_rank(MPI_COMM_WORLD,&myid);
   MPI_Get_processor_name(processor_name,&namelen);
   fprintf(stderr,"Process %d on %s\n",myid, processor_name);
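   if (myid==0) {
     pipo=fopen("pipo","r");
     if (pipo==NULL) {   /* input file missing: stop all ranks */
       perror("pipo");
       MPI_Abort(MPI_COMM_WORLD,1);
     }
     fscanf(pipo,"%d", &n);
     fscanf(pipo,"%d", &nit);
     fclose(pipo);
     fprintf(stderr,"%d %d\n",n,nit);
   }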
   MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
   MPI_Bcast(&nit, 1, MPI_INT, 0, MPI_COMM_WORLD);
   MPI_Barrier(MPI_COMM_WORLD);
   x=0.0;
   y=1.5*(n+1.0);
   omp_set_num_threads(4);
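/* firstprivate(x): each thread's copy of x starts from the 0.0 assigned above;
   with plain private(x) it would start uninitialized */
#pragma omp parallel for default(shared) private(it,mythread,i,j,k) firstprivate(x)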
   for (it=1;it<=nit;it++) {
     for (i=1;i<=n;i++) {
       for (j=1;j<=n;j++) {
         for (k=1;k<=n;k++) {
           x+=i+j+k+it+myid-y;
         }
       }
     }
     x=x/(n*n*n);
     mythread=omp_get_thread_num();
     fprintf(stderr,"%d %d %d %lf\n",myid,mythread,it,x);
   }
   MPI_Finalize();
   return 0;
}
-------------------------------------

As you can see, it is a simple loop which can be run in parallel
over the index "it".
I also tried setting OMP_NUM_THREADS to 4 and commenting out the
   omp_set_num_threads(4);
line, with the same result.
The program is compiled with the "-openmp" option.
The first print shows that the variable "provided" has the value 3,
that is MPI_THREAD_MULTIPLE, as expected.
Configuring mvapich2 with --enable-threads=funneled, for example,
gives "provided" equal to 1, as expected, but no more parallelism.
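
The independence of the iterations over "it" can be made explicit.
What follows is only a sketch, not the program that was run: the
helper names iteration_average and run_iterations are made up for
illustration, and the output array "out" is an assumption added so
that results can be compared afterwards. In the test program, x
accumulates across whatever iterations a given thread happens to
execute, so the printed values depend on the scheduling; here the
accumulator is local to one iteration instead.

```c
#include <assert.h>

/* Hypothetical helper (not in the original program): average of
   i+j+k+it+myid-y over the n*n*n grid for one outer iteration.
   With y = 1.5*(n+1), the mean of i+j+k cancels against y, so the
   result is mathematically it + myid. */
double iteration_average(int n, int it, int myid, double y)
{
   double x = 0.0;   /* local accumulator, reset for every iteration */
   int i, j, k;

   assert(n > 0);
   for (i = 1; i <= n; i++)
      for (j = 1; j <= n; j++)
         for (k = 1; k <= n; k++)
            x += i + j + k + it + myid - y;
   return x / ((double)n * n * n);
}

/* Hypothetical driver: fills out[0..nit-1], one slot per iteration.
   The pragma is simply ignored if OpenMP is not enabled at compile time. */
void run_iterations(int n, int nit, int myid, double *out)
{
   double y = 1.5 * (n + 1.0);
   int it;

#pragma omp parallel for default(shared) private(it)
   for (it = 1; it <= nit; it++)
      out[it - 1] = iteration_average(n, it, myid, y);
}
```

Since each iteration writes only its own slot of "out", the result
should not depend on how many threads execute the loop, which makes it
easier to compare a 1-thread run with a 4-thread run.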


> To allow us to better diagnose the problem can you also give us some details
> about the OpenMP parallel directives you are using to create the threads 
> in your program?  Did you set the environment variable OMP_NUM_THREADS to
> 4?
> 
> 
> Also, what kind of operations are you using in the parallel
> sections? Do you have any critical sections or locks in the parallel section
> that might be serializing things?
> 
> A code snippet of the OpenMP parallel sections as well the version where 
> threads are created manually will be helpful.
> 
> thanks,
> --ranjit
> 

Thanks a lot for your help

Dominique

