[mvapich-discuss] MVAPICH + MKL + ~100 lines of code = crash

Dhabaleswar Panda panda at cse.ohio-state.edu
Sat Oct 10 20:32:49 EDT 2009


Good to know that the 1.1 branch version fixes the problem. Let us know
the outcome once you have done more extensive experiments.

Thanks,

DK

On Sat, 10 Oct 2009, Antonio Molins wrote:

> Thanks so much, DK!
>
> I was actually using MVAPICH 1.1, the latest version available on the
> webpage, not the nightly build. I have been trying the one you sent
> me, and it seems to fix the problem. I will let you know once I have
> tried it more thoroughly.
>
> Best,
> A
>
> On Oct 9, 2009, at 9:16 PM, Dhabaleswar Panda wrote:
>
> > Thanks for the report. Your note indicates that you are using the
> > MVAPICH 1.0 version. This is a very old version (released in
> > February '08). Can you check whether this happens with the latest
> > MVAPICH 1.1 version? Please use the branch version obtained from the
> > following URL. It has all the bug-fixes since the 1.1 release.
> >
> > http://mvapich.cse.ohio-state.edu/nightly/mvapich/branches/1.1/
> >
> > In the Changelog
> > (http://mvapich.cse.ohio-state.edu/download/mvapich/changes.shtml), I
> > see that a fix for a segmentation fault while using BLACS went into
> > the 1.0.1 version.
> >
> > Let us know whether the problem persists with this latest 1.1 branch
> > version and we will take a look at this in-depth.
> >
> > Thanks,
> >
> > DK
> >
> > On Fri, 9 Oct 2009, Antonio Molins wrote:
> >
> >> Hi everybody,
> >>
> >> I am new to the list and have been programming in MPI for about a
> >> year. I am running into what I think is a bug in either MVAPICH or
> >> MKL, and it is totally driving me nuts :(
> >>
> >> The following code has proved able to generate a crash with MKL 10.2
> >> update 2 (both the sequential and the threaded libraries) and the
> >> latest revision of MVAPICH, on two different clusters. Can anybody
> >> tell me what the problem is here? It does not always crash, but it
> >> does crash when the right combination of MPI process count and matrix
> >> size is selected. Sometimes, instead of crashing, it slows down by
> >> several orders of magnitude, even though the actual data being
> >> processed is exactly the same.
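> >>
> >> For reference, I build it roughly as follows; the MPI compiler wrapper
> >> and the exact MKL library names are from memory, so they may need
> >> adjusting for your install:
> >>
> >> mpicxx crash.cpp -o crash -I$MKLROOT/include -L$MKLROOT/lib/em64t \
> >>     -lmkl_scalapack_lp64 -lmkl_blacs_lp64 -lmkl_intel_lp64 \
> >>     -lmkl_sequential -lmkl_core -lpthread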
> >>
> >> A
> >>
> >> /*
> >>  * crash.cpp - crashes with ICC 11.1, MKL 10.2, MVAPICH 1.0 on Linux 64-bit,
> >>  * linked with either the serial or the threaded MKL libraries,
> >>  * when run as: mpirun -np 36 crash 5000 10
> >>  */
> >>
> >>
> >>
> >> #include <stdio.h>
> >> #include <stdlib.h>
> >> #include <string.h>
> >> #include <math.h>
> >> #include "mpi.h"
> >> #include "mkl_scalapack.h"
> >>
> >>
> >>
> >> extern "C" {
> >> /* BLACS C interface */
> >> void Cblacs_get( int context, int request, int* value );
> >> int Cblacs_gridinit( int* context, char* order, int np_row, int np_col );
> >> void Cblacs_gridinfo( int context, int* np_row, int* np_col,
> >>                       int* my_row, int* my_col );
> >> int numroc_( int* n, int* nb, int* iproc, int* isrcproc, int* nprocs );
> >> /* PBLAS */
> >> void pdgemm_( char* TRANSA, char* TRANSB, int* M, int* N, int* K,
> >>               double* ALPHA,
> >>               double* A, int* IA, int* JA, int* DESCA,
> >>               double* B, int* IB, int* JB, int* DESCB,
> >>               double* BETA, double* C, int* IC, int* JC, int* DESCC );
> >> }
> >>
> >>
> >>
> >> #define BLOCK_SIZE 65
> >>
> >>
> >>
> >> int main( int argc, char* argv[] )
> >> {
> >> int iam, nprocs;
> >> MPI_Init(&argc,&argv);       /* starts MPI */
> >> MPI_Comm_rank(MPI_COMM_WORLD, &iam);
> >> MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
> >>
> >>
> >> // dismiss the processes that are not part of the square grid
> >> int blacs_pgrid_size = (int)floor(sqrt((double)nprocs));
> >> if (iam >= blacs_pgrid_size*blacs_pgrid_size) {
> >> printf("Bye bye world from process %d of %d. BLACS had no place for me...\n",
> >>        iam, nprocs);
> >> MPI_Finalize();
> >> return 0; // without this, the excluded ranks fall through into the BLACS calls below
> >> }
> >>
> >>
> >> // start BLACS with square processor grid
> >> if(iam==0)
> >> printf("starting BLACS...");
> >> int ictxt,nprow,npcol,myrow,mycol;
> >> Cblacs_get( -1, 0, &ictxt );
> >> Cblacs_gridinit( &ictxt, "C", blacs_pgrid_size, blacs_pgrid_size );
> >> Cblacs_gridinfo( ictxt, &nprow, &npcol, &myrow, &mycol );
> >> if(iam==0)
> >> printf("done.\n");
> >>
> >>
> >> double timing;
> >> int m,n,k,lm,ln,nbm,nbn,rounds;
> >> int myzero=0,myone=1;
> >> sscanf(argv[1],"%d",&m);
> >> n=m;
> >> k=m;
> >> sscanf(argv[2],"%d",&rounds);
> >>
> >>
> >>
> >> nbm = BLOCK_SIZE;
> >> nbn = BLOCK_SIZE;
> >> lm = numroc_(&m, &nbm, &myrow, &myzero, &nprow);
> >> ln = numroc_(&n, &nbn, &mycol, &myzero, &npcol);
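> >> // lm and ln are the numbers of local rows and columns this process owns
> >> // under the block-cyclic distribution, as returned by numroc_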
> >>
> >>
> >> int info;
> >> int *ipiv = new int[lm+nbm+10000000]; // adding a "little" bit of extra space, just in case
> >> char ta = 'N',tb = 'T';
> >> double alpha = 1.0, beta = 0.0;
> >>
> >>
> >>
> >> double* test1data = new double[lm*ln];
> >> double* test2data = new double[lm*ln];
> >> double* test3data = new double[lm*ln];
> >>
> >>
> >> for(int i=0;i<lm*ln;i++)
> >> test1data[i]=(double)(rand()%100)/10000.0;
> >>
> >>
> >> int *test1desc  = new int[9];
> >> int *test2desc  = new int[9];
> >> int *test3desc  = new int[9];
> >>
> >>
> >> test1desc[0] = 1;     // descriptor type
> >> test1desc[1] = ictxt; // BLACS context
> >> test1desc[2] = m;     // global number of rows
> >> test1desc[3] = n;     // global number of columns
> >> test1desc[4] = nbm;   // row block size
> >> test1desc[5] = nbn;   // column block size (defined equal to the row block size)
> >> test1desc[6] = 0;     // initial process row (defined 0)
> >> test1desc[7] = 0;     // initial process column (defined 0)
> >> test1desc[8] = lm;    // leading dimension of the local array
> >>
> >> memcpy(test2desc,test1desc,9*sizeof(int));
> >> memcpy(test3desc,test1desc,9*sizeof(int));
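> >>
> >> // Aside: ScaLAPACK's descinit_ could fill these descriptors instead of
> >> // setting the fields by hand; a sketch, assuming the usual Fortran-style
> >> // interface, would be:
> >> //   int dinfo;
> >> //   descinit_(test1desc, &m, &n, &nbm, &nbn, &myzero, &myzero, &ictxt, &lm, &dinfo);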
> >>
> >>
> >>
> >> for(int iter=0;iter<rounds;iter++)
> >> {
> >> if(iam==0)
> >> printf("iter %i - ",iter);
> >> //test2 = test1
> >> memcpy(test2data,test1data,lm*ln*sizeof(double));
> >> // test3 = test1 * test2' (note tb == 'T', so the second operand is transposed)
> >> timing=MPI_Wtime();
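> >> // PDGEMM computes C := alpha*op(A)*op(B) + beta*C on the block-cyclic
> >> // distributed matrices; the &myone arguments are the global row/column
> >> // offsets (ia, ja, ib, jb, ic, jc)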
> >> pdgemm_(&ta,&tb,&m,&n,&k,
> >> &alpha,
> >> test1data,&myone,&myone,test1desc,
> >> test2data,&myone,&myone, test2desc,
> >> &beta,
> >> test3data,&myone,&myone, test3desc);
> >> if(iam==0)
> >> printf(" PDGEMM = %f |",MPI_Wtime()-timing);
> >> //test3 = LU(test3)
> >> timing=MPI_Wtime();
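> >> // PDGETRF overwrites test3data with its LU factors; ipiv must have room
> >> // for at least LOCr(m) + mb entries (hence the lm + nbm sizing above)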
> >> pdgetrf_(&m, &n, test3data, &myone, &myone, test3desc, ipiv, &info);
> >> if(iam==0)
> >> printf(" PDGETRF = %f.\n",MPI_Wtime()-timing);
> >> }
> >> delete[] ipiv;
> >> // "delete[] a, b, c;" only frees the first array (comma operator),
> >> // so each array is freed explicitly
> >> delete[] test1data;
> >> delete[] test2data;
> >> delete[] test3data;
> >> delete[] test1desc;
> >> delete[] test2desc;
> >> delete[] test3desc;
> >>
> >>
> >> MPI_Finalize();
> >> return 0;
> >> }
> >>
> >> --------------------------------------------------------------------------------
> >> Antonio Molins, PhD Candidate
> >> Medical Engineering and Medical Physics
> >> Harvard - MIT Division of Health Sciences and Technology
> >> --
> >> "Y así del poco dormir y del mucho leer,
> >> se le secó el cerebro de manera que vino
> >> a perder el juicio".
> >>                                       Miguel de Cervantes
> >> --------------------------------------------------------------------------------
> >>
> >>
> >>
> >>
> >
> >
> > _______________________________________________
> > mvapich-discuss mailing list
> > mvapich-discuss at cse.ohio-state.edu
> > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
> --------------------------------------------------------------------------------
> Antonio Molins, PhD Candidate
> Medical Engineering and Medical Physics
> Harvard - MIT Division of Health Sciences and Technology
> --
> "Y así del poco dormir y del mucho leer,
> se le secó el cerebro de manera que vino
> a perder el juicio".
>                                         Miguel de Cervantes
> --------------------------------------------------------------------------------
>
>
>
>



