[mvapich-discuss] Re: mvapich2 and Intel MPI Benchmarks

Dhabaleswar Panda panda at cse.ohio-state.edu
Sat Sep 19 07:28:44 EDT 2009


Bryan,

Thanks for your postings. We have been running IMB 3.2 without any
problems. When configuring the mvapich2 library, are you disabling shared
memory communication? Are you seeing this problem only with `mpiexec', or
with `mpirun_rsh' as well? I suggest you start using the `mpirun_rsh'
framework for scalable job launching.
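
A minimal sketch of such a launch follows, comparing a run with shared
memory collectives enabled against one with the MV2_USE_SHMEM_COLL=0
setting quoted below. The hostfile name `hosts' and the DRY_RUN switch are
assumptions made for illustration; mpirun_rsh takes VAR=value pairs placed
before the executable. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch only: the hostfile name "hosts" and the DRY_RUN switch are
# illustrative assumptions, not part of the original report.
NP=8
HOSTFILE=hosts
BENCH=./IMB-MPI1

# Build an mpirun_rsh command line.
# $1 = value for MV2_USE_SHMEM_COLL (1 = enabled, 0 = disabled)
launch_cmd() {
    echo "mpirun_rsh -np $NP -hostfile $HOSTFILE MV2_USE_SHMEM_COLL=$1 $BENCH"
}

# Try the benchmark with shared memory collectives on, then off.
for coll in 1 0; do
    cmd=$(launch_cmd "$coll")
    echo "$cmd"
    # Only execute when DRY_RUN=0 is set explicitly.
    if [ "${DRY_RUN:-1}" = "0" ]; then
        $cmd
    fi
done
```

Run it with DRY_RUN=0 on a node where mpirun_rsh is in the PATH to
actually start the two jobs. If only the first one fails, that points at
the shared memory collectives rather than at the launcher.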

Thanks,

DK

> > Hi All,
> >
> > I just wondered if you'd had a chance to try out the Intel MPI Benchmarks
> > with mvapich2. IMB-3.2 is what I'm currently running.
> >
> > We've had no problems with the latest mpich2-1.1.1p1. However, when using
> > mvapich2-1.4rc2 or mvapich2-1.4rc1 (on both our IB and iWARP clusters)
> > the benchmarks fail. For example, even a simple
> >
> > mpiexec -np 8 ./IMB-MPI1
> >
> > will result in errors such as
>
> I've discovered that these errors don't occur if each of the 8 processes
> is on a separate node. In fact, if I disable "shared memory collectives"
> by setting the variable
>
> MV2_USE_SHMEM_COLL=0
>
> then the Intel MPI Benchmarks run cleanly even when all the processes
> are on a single node, and things work fine on both the IB and iWARP
> clusters.
>
> Bryan
>
> >
> >
> > rank 6 in job 1  coates-a279.rcac.purdue.edu_51100   caused collective
> > abort of all ranks
> >   exit status of rank 6: killed by signal 9
> > rank 5 in job 1  coates-a279.rcac.purdue.edu_51100   caused collective
> > abort of all ranks
> >   exit status of rank 5: killed by signal 11
> > rank 4 in job 1  coates-a279.rcac.purdue.edu_51100   caused collective
> > abort of all ranks
> >   exit status of rank 4: killed by signal 11
> > rank 3 in job 1  coates-a279.rcac.purdue.edu_51100   caused collective
> > abort of all ranks
> >   exit status of rank 3: killed by signal 11
> > rank 2 in job 1  coates-a279.rcac.purdue.edu_51100   caused collective
> > abort of all ranks
> >   exit status of rank 2: kille
> >
> >
> >
> > Thanks,
> > Bryan
> >
> > --
> > Bryan Putnam
> > Rosen Center for Advanced Computing, Purdue University
> > Young Hall (Rm. 519)
> > 302 Wood Street
> > West Lafayette, IN 47907-2108
> > Ph 765-496-8225 Fax 765-494-0566
> > bfp at purdue.edu
> > http://www.rcac.purdue.edu
> >
> >
> >
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>


