[mvapich-discuss] mvapich2-0.9.8 blacs problems

amith rajith mamidala mamidala at cse.ohio-state.edu
Mon Mar 19 22:06:46 EDT 2007


Hi Bas,

Can you please apply the one-line patches below and let us know the
outcome? I have tried a couple of cases and the patch is working fine.
Also, can you let us know the nature of this application (scal.f)?
It seems to be using several hundred MPI_Comm_split operations. Is this
the typical application pattern?
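
For context, here is a minimal sketch of the pattern in question: a loop
that repeatedly creates and frees communicators with MPI_Comm_split. This
is only an illustration, not scal.f itself; the loop count of 500 and the
parity-based split are arbitrary choices meant to exceed a typical
MAX_ALLOWED_COMM limit.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, i;
    MPI_Comm newcomm;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Create and free a few hundred communicators; once comm_count
     * exceeds MAX_ALLOWED_COMM, the early return removed by the
     * patches below would skip the 2-level communicator setup in
     * create_2level_comm. */
    for (i = 0; i < 500; i++) {
        MPI_Comm_split(MPI_COMM_WORLD, rank % 2, rank, &newcomm);
        /* ... collective work on newcomm would go here ... */
        MPI_Comm_free(&newcomm);
    }

    if (rank == 0)
        printf("done after %d splits\n", i);

    MPI_Finalize();
    return 0;
}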

For mvapich-0.9.9-beta:

Index: create_2level_comm.c (In $HOME/src/context)
===================================================================
--- create_2level_comm.c        (revision 1102)
+++ create_2level_comm.c        (working copy)
@@ -56,7 +56,6 @@
     struct MPIR_COMMUNICATOR* comm_world_ptr;
     comm_world_ptr = MPIR_GET_COMM_PTR(MPI_COMM_WORLD);

-    if (comm_count > MAX_ALLOWED_COMM) return;

     int* shmem_group = malloc(sizeof(int) * size);
     if (NULL == shmem_group){



For mvapich2-0.9.8:

Index: create_2level_comm.c (In $HOME/src/mpi/comm)
===================================================================
--- create_2level_comm.c        (revision 1104)
+++ create_2level_comm.c        (working copy)
@@ -33,7 +33,6 @@
     MPID_Comm_get_ptr( comm, comm_ptr );
     MPID_Comm_get_ptr( MPI_COMM_WORLD, comm_world_ptr );

-    if (comm_count > MAX_ALLOWED_COMM) return;

     MPIR_Nest_incr();
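
As a side note on why a BLACS test would hit this limit: each BLACS grid
setup creates new MPI communicators internally (which is presumably where
the many MPI_Comm_split calls come from), so a program that repeatedly
initializes and releases a process grid accumulates communicator creations
quickly. Below is only a rough, hypothetical C analogue of such a loop; it
is not the scal.f program quoted further down (which is Fortran), and the
Cblacs_* prototypes are assumed from the C BLACS interface.

#include <mpi.h>

/* Prototypes assumed from the C BLACS interface (Cblacs_*). */
void Cblacs_pinfo(int *mypnum, int *nprocs);
void Cblacs_get(int icontxt, int what, int *val);
void Cblacs_gridinit(int *icontxt, char *order, int nprow, int npcol);
void Cblacs_gridexit(int icontxt);

int main(int argc, char **argv)
{
    int me, nprocs, ictxt, iter;

    MPI_Init(&argc, &argv);
    Cblacs_pinfo(&me, &nprocs);

    for (iter = 0; iter < 100; iter++) {
        /* Each grid init builds fresh communicators under the hood. */
        Cblacs_get(0, 0, &ictxt);                  /* default system context */
        Cblacs_gridinit(&ictxt, "Row", nprocs, 1); /* nprocs x 1 (non-square) grid */
        /* ... a ScaLAPACK call on this grid would go here ... */
        Cblacs_gridexit(ictxt);                    /* release the grid */
    }

    MPI_Finalize();
    return 0;
}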



Thanks,
Amith


On Mon, 19 Mar 2007, Bas van der Vlies wrote:

> Dhabaleswar Panda wrote:
> > Hi Bas,
> >
> >>   We have done some further testing with mvapich versions:
> >>    * 0.9.8: everything works
> >>
> >>    * 0.9.9-beta: it is very slow and it also hangs, like mvapich2
> >
> > Thanks for reporting this. Just to check ... are you using the latest
> > mvapich 0.9.9 from the trunk or the beta tarball released on 02/09/07?
> > A lot of fixes and tunings have gone in since the beta version
> > was released. You can get the latest version of the trunk through an SVN
> > checkout or by downloading the nightly tarballs of the trunk.
> >
> I downloaded the tarball. We will test the latest trunk version.
>
> Regards
>
> > Best Regards,
> >
> > DK
> >
> >> Regards
> >>
> >>> Thanks.
> >>>
> >>> Regards,
> >>> Wei Huang
> >>>
> >>> 774 Dreese Lab, 2015 Neil Ave,
> >>> Dept. of Computer Science and Engineering
> >>> Ohio State University
> >>> OH 43210
> >>> Tel: (614)292-8501
> >>>
> >>>
> >>> On Mon, 19 Mar 2007, Bas van der Vlies wrote:
> >>>
> >>>> wei huang wrote:
> >>>>> Hi,
> >>>>>
> >>>>> Thanks for letting us know about the problem. We have generated a patch to
> >>>>> address this problem, and have applied it to both the trunk and our svn
> >>>>> 0.9.8 branch.
> >>>>>
> >>>>>
> >>>> We have done some more tests and found another problem using mvapich2
> >>>> and blacs. These problems are encountered by user programs: we get
> >>>> reports from our users that they get wrong answers from their programs.
> >>>>
> >>>> We have made a small Fortran (g77) program to illustrate the problem.
> >>>> It calls the same ScaLAPACK routine a number of times. Independent of
> >>>> the size of the problem, the program hangs after 8 or 31 iterations,
> >>>> except when the number of processes is a square, e.g. 1x1, 2x2, ...
> >>>>
> >>>> How to compile the program:
> >>>> mpif77 -Wall -g -O0 -o scal scal.f -lscalapack -lfblacs -lcblacs -lblacs
> >>>> -llapack -latlas
> >>>>
> >>>> The program expects the following on standard input:
> >>>> <size of matrix> <block size> <number of iterations>
> >>>>
> >>>> for example:
> >>>>   echo '100 16 100' | mpiexec -n <np> ./scal
> >>>>
> >>>> Regards
> >>>>
> >>>>
> >>>> PS) This program behaves correctly with the Topspin/Cisco software, which
> >>>> is based on their InfiniBand stack and on an mvapich1 version.
> >>>>
> >>>> We are going to test the program with mvapich1 from OSU.
> >>>> --
> >>>> ********************************************************************
> >>>> *                                                                  *
> >>>> *  Bas van der Vlies                     e-mail: basv at sara.nl      *
> >>>> *  SARA - Academic Computing Services    phone:  +31 20 592 8012   *
> >>>> *  Kruislaan 415                         fax:    +31 20 6683167    *
> >>>> *  1098 SJ Amsterdam                                               *
> >>>> *                                                                  *
> >>>> ********************************************************************
> >>>>
> >>
> >> --
> >> ********************************************************************
> >> *                                                                  *
> >> *  Bas van der Vlies                     e-mail: basv at sara.nl      *
> >> *  SARA - Academic Computing Services    phone:  +31 20 592 8012   *
> >> *  Kruislaan 415                         fax:    +31 20 6683167    *
> >> *  1098 SJ Amsterdam                                               *
> >> *                                                                  *
> >> ********************************************************************
> >> _______________________________________________
> >> mvapich-discuss mailing list
> >> mvapich-discuss at cse.ohio-state.edu
> >> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> >>
> >
>
>
> --
> ********************************************************************
> *                                                                  *
> *  Bas van der Vlies                     e-mail: basv at sara.nl      *
> *  SARA - Academic Computing Services    phone:  +31 20 592 8012   *
> *  Kruislaan 415                         fax:    +31 20 6683167    *
> *  1098 SJ Amsterdam                                               *
> *                                                                  *
> ********************************************************************
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>



