[mvapich-discuss] mvapich2 and malloc

Sayantan Sur surs at cse.ohio-state.edu
Thu Jul 12 13:39:23 EDT 2007


Hi Bryan,

Thanks for such a detailed bug report! No, this is not a known issue. I
was able to reproduce the problem, and the attached patch fixes it by
changing the length arguments of find_and_free_dregs_inside() and
mvapich_munmap() from int to size_t. Can you let us know if this patch
works for you?

Thanks,
Sayantan.

Bryan Putnam wrote:
> Hi,
>
> I've run across a piece of code that fails (hangs) with mvapich2
> but runs successfully with mpich2 and mpich.  The problem seems to
> occur when the number of bytes being allocated is greater than the
> largest signed 32-bit integer. So, even though we used 64-bit compilers
> to build this version of mvapich2, it appears that mvapich2 may be
> using its own version of malloc that isn't able to handle 64-bit
> addresses. Is this a known problem?
>
> Thanks,
> Bryan
>
> I've appended the code for your enjoyment in case you'd like to experiment 
> with it. It works OK with ncol=nrow=nsec=812, but fails with 813. In the 
> latter case, the number of bytes exceeds the maximum signed 32-bit integer.
>
>       program alloc3
>       use mpi
> c     include 'mpif.h'
>       integer me, nt, mpierr, status(MPI_STATUS_SIZE)
>       integer*4 allocate_stat
>       real*4, allocatable :: x(:,:,:)
> c     real*8, allocatable :: x(:,:,:)
>
>       call MPI_INIT(mpierr)
>       call MPI_COMM_SIZE(MPI_COMM_WORLD, nt, mpierr)
>       call MPI_COMM_RANK(MPI_COMM_WORLD, me, mpierr)
>
> c     max int = 2147483647
> c     813**3 * 4 = 2149471188
> c     812**3 * 4 = 2141549312
>
>       NCOL = 813
>       NROW = 813
>       NSEC = 813
> c     NCOL = 812
> c     NROW = 812
> c     NSEC = 812
>
>       ALLOCATE(X(NCOL,NROW,NSEC), STAT=ALLOCATE_STAT)
>       IF( ALLOCATE_STAT .NE. 0)THEN
>         print *,'Cannot allocate memory in GET_PRJSPFTS_G'
>       ENDIF
>       
>       
>       
>       DO I = 1,NSEC
>         DO J = 1,NROW
>           DO K = 1, NCOL
>             X(K,J,I) = 0.0
>           ENDDO
>         ENDDO
>         if(me.eq.0)print *,'finished initializing map3d sec',I
>       ENDDO
>
>       deallocate(x)
>
>       call MPI_FINALIZE(mpierr)
>       print *, "Done1"
>       stop
>       end
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>   


-- 
http://www.cse.ohio-state.edu/~surs

-------------- next part --------------
Index: dreg.c
===================================================================
--- dreg.c      (revision 1362)
+++ dreg.c      (working copy)
@@ -974,9 +974,9 @@
 
 #ifndef DISABLE_PTMALLOC
 
-void find_and_free_dregs_inside(void *buf, int len)
+void find_and_free_dregs_inside(void *buf, size_t len)
 {
-    int i;  
+    unsigned long i;
     unsigned long pagenum_low, pagenum_high;
     unsigned long  npages, begin, end;
     unsigned long user_low_a, user_high_a;
Index: dreg.h
===================================================================
--- dreg.h      (revision 1362)
+++ dreg.h      (working copy)
@@ -180,7 +180,7 @@
 dreg_entry *dreg_new_entry(void *buf, int len, int acl);
 
 #ifndef DISABLE_PTMALLOC
-void find_and_free_dregs_inside(void *buf, int len);
+void find_and_free_dregs_inside(void *buf, size_t len);
 #endif
 
 #endif                          /* _DREG_H */
Index: mem_hooks.c
===================================================================
--- mem_hooks.c (revision 1362)
+++ mem_hooks.c (working copy)
@@ -94,7 +94,7 @@
 
 #ifndef DISABLE_MUNMAP_HOOK
 
-int mvapich_munmap(void *buf, int len)
+int mvapich_munmap(void *buf, size_t len)
 {
     if(!mvapich_minfo.munmap) {
         set_real_munmap_ptr();
Index: mem_hooks.h
===================================================================
--- mem_hooks.h (revision 1362)
+++ mem_hooks.h (working copy)
@@ -42,7 +42,7 @@
 void mvapich_mfin(void);
 
 #ifndef DISABLE_MUNMAP_HOOK
-int mvapich_munmap(void *buf, int len);
+int mvapich_munmap(void *buf, size_t len);
 #endif
 
 #ifndef DISABLE_TRAP_SBRK


