[mvapich-discuss] Strange error with MPI_REDUCE

amith rajith mamidala mamidala at cse.ohio-state.edu
Mon Dec 17 11:02:23 EST 2007


Hi Christian,

I have attached the patch for MVAPICH2 (0.9.8) to this mail. Could you
please try it out?

Thanks,
Amith.

On Wed, 12 Dec 2007, amith rajith mamidala wrote:

> Hi Christian,
>
> Thanks for trying out the patch.
> I will post a patch to MVAPICH2 in the next few days.
>
> Thanks,
> Amith.
>
> On Tue, 11 Dec 2007, Christian Boehme wrote:
>
> > Hi Amith,
> > > Can you also try the patch I am attaching with this mail and let us know
> > > how it works?
> > >
> >
> > Now that the patch seems to work, would it be possible to get a similar
> > patch for MVAPICH2 (version 0.9.8)? Many thanks
> >
> > Christian Boehme
> >
>
>
-------------- next part --------------
Index: reduce.c
===================================================================
--- reduce.c	(revision 1676)
+++ reduce.c	(working copy)
@@ -728,14 +728,14 @@
     MPI_Comm shmem_comm, leader_comm;
     MPID_Comm *shmem_commptr = 0, *leader_commptr = 0;
     int local_rank = -1, global_rank = -1, local_size=0, my_rank;
-    void* local_buf, *tmpbuf;
+    void* local_buf, *tmpbuf, *tmpbuf1;
     MPI_Aint   true_lb, true_extent, extent;
     MPI_User_function *uop;
     int stride = 0, i, is_commutative, size;
     MPID_Op *op_ptr;
     MPI_Status status;
     int leader_root, total_size, shmem_comm_rank;
-    MPIU_CHKLMEM_DECL(1);
+    MPIU_CHKLMEM_DECL(2);
 #ifdef HAVE_CXX_BINDING
     int is_cxx_uop = 0;
 #endif
@@ -921,6 +921,8 @@
                     global_rank = leader_commptr->rank;
                     MPIU_CHKLMEM_MALLOC(tmpbuf, void *, count*(MPIR_MAX(extent,true_extent)), mpi_errno, "receive buffer");
                     tmpbuf = (void *)((char*)tmpbuf - true_lb);
+                    MPIU_CHKLMEM_MALLOC(tmpbuf1, void *, count*(MPIR_MAX(extent,true_extent)), mpi_errno, "receive buffer");
+                    tmpbuf1 = (void *)((char*)tmpbuf1 - true_lb);
                     MPIR_Nest_incr();
                     mpi_errno = MPIR_Localcopy(sendbuf, count, datatype, tmpbuf,
                             count, datatype);
@@ -956,7 +958,7 @@
                     leader_root = comm_ptr->leader_rank[leader_of_root];
                     if (local_size != total_size){
                         MPIR_Nest_incr();
-                        mpi_errno = MPIR_Reduce(tmpbuf, recvbuf, count, datatype,
+                        mpi_errno = MPIR_Reduce(tmpbuf, tmpbuf1, count, datatype,
                                 op, leader_root, leader_commptr); 
                         MPIR_Nest_decr();
                     }
@@ -978,6 +980,13 @@
                     MPIDI_CH3I_SHMEM_COLL_SetGatherComplete(local_size, local_rank, shmem_comm_rank);
                 }
 
+                if ((local_rank == 0) && (root == my_rank)){
+                    MPIR_Nest_incr();
+                    mpi_errno = MPIR_Localcopy(tmpbuf1, count, datatype, recvbuf,
+                            count, datatype);
+                    MPIR_Nest_decr();
+                    goto fn_exit;
+                }
 
                 /* Copying data from leader to the root incase leader is
                  * not the root */
@@ -991,7 +1000,7 @@
                                     MPIR_REDUCE_TAG, comm );
                         }
                         else{
-                            mpi_errno  = MPIC_Send( recvbuf, count, datatype, root, 
+                            mpi_errno  = MPIC_Send( tmpbuf1, count, datatype, root, 
                                     MPIR_REDUCE_TAG, comm );
                         }
                     }
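
Reading the diff above: it appears to stop the node leader from passing
recvbuf to the inter-leader MPIR_Reduce and to MPIC_Send. The MPI standard
makes recvbuf significant only at the root, so a leader that is not the root
may hold an invalid recvbuf. The patch reduces into a second scratch buffer
(tmpbuf1) and copies into recvbuf only when the leader is also the root. The
following is a minimal sketch of that idea in plain C, without MPI. The names
sum_into, leader_reduce_sketch and wire are hypothetical and do not appear in
MVAPICH2; wire stands in for the data handed to MPIC_Send.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: element-wise sum of `in` into `out`.
 * Stands in for the reduction operator; not an MVAPICH2 function. */
static void sum_into(const int *in, int *out, size_t count)
{
    for (size_t i = 0; i < count; i++)
        out[i] += in[i];
}

/* Hypothetical sketch of the patched leader path.
 * contrib        : nranks rows of `count` ints, one row per contributor
 * leader_is_root : nonzero if this leader is also the root of the reduce
 * recvbuf        : valid only when leader_is_root is nonzero (may be NULL)
 * wire           : stand-in for the message sent to the root otherwise
 * Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
static int leader_reduce_sketch(const int *contrib, size_t nranks,
                                size_t count, int leader_is_root,
                                int *recvbuf, int *wire)
{
    if (count == 0)
        return 0;

    /* Scratch buffer plays the role of tmpbuf1 in the patch. */
    int *scratch = calloc(count, sizeof *scratch);
    if (scratch == NULL)
        return -1;

    for (size_t r = 0; r < nranks; r++)
        sum_into(contrib + r * count, scratch, count);

    if (leader_is_root) {
        /* Only the root may write to recvbuf. */
        memcpy(recvbuf, scratch, count * sizeof *scratch);
    } else {
        /* A non-root leader forwards the scratch buffer, never recvbuf. */
        memcpy(wire, scratch, count * sizeof *scratch);
    }

    free(scratch);
    return 0;
}
```

With this arrangement a non-root leader can be called with recvbuf set to
NULL, which is the case the original code could not handle safely.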

