[mvapich-discuss] mvapich2 and openmp
Ranjit Noronha
noronha at cse.ohio-state.edu
Tue May 29 15:59:57 EDT 2007
Hi Dominique,
Thanks for sending the code. We built mvapich2-0.9.8p2 with --enable-threads=multiple. We modified your program a bit to measure the loop time. I have attached
the modified version. The program was compiled with icc version 9.1 as follows:
[bash: noronha at i2-2 /tmp/mvapich2-0.9.8p2/osu_benchmarks]$pwd
/tmp/mvapich2-0.9.8p2/osu_benchmarks
icc -O3 -openmp ./openmp.c -lmpich -lpthread -libverbs -libumad -I../src/include -L../lib -L /usr/local/ofed/lib64/
./openmp.c(42) : (col. 1) remark: OpenMP DEFINED LOOP WAS PARALLELIZED.
The program was run on a dual-core, four-CPU Intel Clovertown system (8 cores). We got the
following results:
Number of threads = 8 Loop time: 0.43 s
Number of threads = 4 Loop time: 0.84 s
Number of threads = 2 Loop time: 1.68 s
Number of threads = 1 Loop time: 3.37 s
This seems to indicate that no serialization is happening; top shows
the cores at 100% utilization while the loop is running.
I have attached the complete trace of the output.
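If you want to rule out the OpenMP runtime itself on your nodes, a minimal
sanity check (just a sketch, independent of MPI; assuming icc -openmp or
gcc -fopenmp) is to print the team size OpenMP actually creates:

---------------------------------------
#include <stdio.h>
#include <omp.h>

int main(void)
{
    omp_set_num_threads(4);   /* ask for four threads */
    #pragma omp parallel
    {
        /* every team member reports in; with a working OpenMP setup
           you should see four lines with distinct thread ids */
        printf("thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }
    return 0;
}
---------------------------------------

If this prints only one line, the problem is in the compiler or its
runtime rather than in mvapich2.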
What kind of system are you using? Is it an Intel- or Opteron-based system?
thanks,
--ranjit
> >
> >>I am trying to use mvapich2-0.9.8p2, built with
> >>--enable-threads=multiple, with a program where the various threads
> >>are created by OpenMP. When running the program, the threads
> >>are correctly created and executed, but always SEQUENTIALLY
> >>(each node has two dual-core processors, and could thus accommodate
> >>4 threads). When threads are created manually (not with OpenMP),
> >>everything works fine. Any idea what could be wrong?
> >>Each node runs a Fedora Core 5 distribution with a SMP 2.6.21.3
> >>linux kernel (similar behaviour observed for various kernels).
> >>I tried with gcc, version 4.2, and the Intel C compiler, version 9.1,
> >>with similar results.
> >>
> >
> >Can you let us know which application you are using? If possible, can
> >we get a copy of this application?
> >
>
> The application is very simple and short; it is only for diagnosing
> what the problem might be.
> Here it is:
>
>
> ---------------------------------------
> #include "mpi.h"
> #include <stdio.h>
> #include <math.h>
> #include <omp.h>
>
> int main( int argc, char *argv[])
> {
> int numprocs,myid;
> int namelen,provided;
> int n,nit,mythread;
> int i,it,j,k;
> double x,y;
> char processor_name[MPI_MAX_PROCESSOR_NAME];
> FILE *pipo;
> MPI_Init_thread(&argc,&argv,MPI_THREAD_MULTIPLE,&provided);
> fprintf(stderr,"%d %d %d %d
> %d\n",MPI_THREAD_SINGLE,MPI_THREAD_FUNNELED,MPI_THREAD_SERIALIZED,MPI_THREAD_MULTIPLE,provided);
> MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
> MPI_Comm_rank(MPI_COMM_WORLD,&myid);
> MPI_Get_processor_name(processor_name,&namelen);
> fprintf(stderr,"Process %d on %s\n",myid, processor_name);
> if (myid==0) {
> pipo=fopen("pipo","r");
> fscanf(pipo,"%d", &n);
> fscanf(pipo,"%d", &nit);
> fprintf(stderr,"%d %d\n",n,nit);
> }
> MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
> MPI_Bcast(&nit, 1, MPI_INT, 0, MPI_COMM_WORLD);
> MPI_Barrier(MPI_COMM_WORLD);
> x=0.0;
> y=1.5*(n+1.0);
> omp_set_num_threads(4);
> #pragma omp parallel for default(shared) private(it,mythread,x,i,j,k)
> for (it=1;it<=nit;it++) {
> for (i=1;i<=n;i++) {
> for (j=1;j<=n;j++) {
> for (k=1;k<=n;k++) {
> x+=i+j+k+it+myid-y;
> }
> }
> }
> x=x/(n*n*n);
> mythread=omp_get_thread_num();
> fprintf(stderr,"%d %d %d %lf\n",myid,mythread,it,x);
> }
> MPI_Finalize();
> return 0;
> }
> -------------------------------------
>
> As you can see, it is a simple loop which can be run in parallel
> over the index "it".
> I also tried setting OMP_NUM_THREADS to 4 and commenting out the
> omp_set_num_threads(4);
> line, with the same result.
> The program is compiled with the "-openmp" option.
> The first print shows that the variable "provided" has the value "3",
> that is, MPI_THREAD_MULTIPLE, as expected.
> Configuring mvapich2 with --enable-threads=funneled, for example,
> gives "provided" equal to "1", as expected, but no more parallelism.
>
>
> >To allow us to better diagnose the problem, can you also give us some
> >details
> >about the OpenMP parallel directives you are using to create the threads
> >in your program? Did you set the environment variable OMP_NUM_THREADS to
> >4?
> >
> >
> >Also, what kind of operations are you using in the parallel
> >sections? Do you have any critical sections or locks in the parallel
> >section
> >that might be serializing things?
> >
> >A code snippet of the OpenMP parallel sections, as well as the version
> >where threads are created manually, would be helpful.
> >
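On the "provided" values mentioned above: a minimal sketch (plain MPI-2
calls, nothing mvapich2-specific) that guards against a lower-than-requested
level is to compare "provided" against the requested level right after
MPI_Init_thread and abort if it is lower:

---------------------------------------
#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int provided;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    /* the MPI standard orders the levels SINGLE < FUNNELED <
       SERIALIZED < MULTIPLE, so a simple comparison is enough */
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "needed MPI_THREAD_MULTIPLE, got %d\n", provided);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    MPI_Finalize();
    return 0;
}
---------------------------------------

That said, since your parallel loop makes no MPI calls, the thread level
reported by MPI_Init_thread should not affect the OpenMP scaling itself.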
-------------- next part --------------
#include "mpi.h"
#include <stdio.h>
#include <math.h>
#include <omp.h>
int main( int argc, char *argv[])
{
int ki;
int numprocs,myid=0;
int namelen,provided;
int n,nit,mythread;
int i,it,j,k;
double x,y;
double latency,t_start,t_end;
char processor_name[MPI_MAX_PROCESSOR_NAME];
FILE *pipo;
MPI_Init_thread(&argc,&argv,MPI_THREAD_MULTIPLE, &provided);
fprintf(stderr,"%d %d %d %d %d\n",MPI_THREAD_SINGLE,MPI_THREAD_FUNNELED,MPI_THREAD_SERIALIZED,MPI_THREAD_MULTIPLE,provided);
MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
MPI_Comm_rank(MPI_COMM_WORLD,&myid);
MPI_Get_processor_name(processor_name,&namelen);
fprintf(stderr,"Process %d on %s\n",myid, processor_name);
if (myid==0) {
#if 0
pipo=fopen("pipo","r");
fscanf(pipo,"%d", &n);
fscanf(pipo,"%d", &nit);
#endif
n=100;nit=1000;
fprintf(stderr,"%d %d\n",n,nit);
}
MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
MPI_Bcast(&nit, 1, MPI_INT, 0, MPI_COMM_WORLD);
MPI_Barrier(MPI_COMM_WORLD);
x=0.0;
y=1.5*(n+1.0);
for(ki=8;ki>=1;ki=ki/2){
omp_set_num_threads(ki);
if(myid==0) t_start = MPI_Wtime();
#pragma omp parallel for default(shared) private(it,mythread,x,i,j,k)
for (it=1;it<=nit;it++) {
for (i=1;i<=n;i++) {
for (j=1;j<=n;j++) {
for (k=1;k<=n;k++) {
x+=i+j+k+it+myid-y;
}
}
}
x=x/(n*n*n);
mythread=omp_get_thread_num();
fprintf(stderr,"%d %d %d %lf\n",myid,mythread,it,x);
}
if (myid==0) {
t_end = MPI_Wtime();
latency = (t_end - t_start);
fprintf(stdout, "Number of threads = %d Loop time: %0.2f s \n",ki, latency);
}
}
MPI_Finalize();
return 0;
}
-------------- next part --------------
A non-text attachment was scrubbed...
Name: openmp.out.gz
Type: application/x-gzip
Size: 15748 bytes
Desc: not available
Url : http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20070529/fd3519f2/openmp.out-0001.bin