[mvapich-discuss] mvapich2 + MALLOC_CHECK_

Dan Kokron daniel.kokron at nasa.gov
Thu Jun 3 17:36:43 EDT 2010


The MPI_Finalize failure happens at line 1501 of rdma_iba_init.c, which
in my sandbox is

            if (vc->mrail.rfp.RDMA_send_buf_mr[hca_index])
            {
 --->           ibv_dereg_mr(vc->mrail.rfp.RDMA_send_buf_mr[hca_index]);
            }
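
As an aside, ibv_dereg_mr() returns an int status.  A minimal sketch of a
more defensive version of that step is below; it only illustrates the
guard-against-double-deregistration idea and is not the actual MVAPICH2
code (dereg_send_buf_mr() is a hypothetical helper):

    #include <stdio.h>
    #include <infiniband/verbs.h>

    /* Deregister one RDMA fast-path send buffer MR, report a failure,
     * and clear the handle so a later pass through finalize cannot
     * deregister the same MR twice. */
    static void dereg_send_buf_mr(struct ibv_mr **mr)
    {
        if (*mr == NULL)
            return;
        if (ibv_dereg_mr(*mr) != 0)
            fprintf(stderr, "ibv_dereg_mr failed\n");
        *mr = NULL;
    }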

I deleted the core files during subsequent attempts.  I have a job
queued now to reproduce them. 

Have a good weekend.
Dan

On Thu, 2010-06-03 at 13:14 -0500, Sayantan Sur wrote:
> Hi Dan,
> 
> On Thu, Jun 3, 2010 at 1:40 PM, Dan Kokron <daniel.kokron at nasa.gov> wrote:
> > I originally enabled the MALLOC_CHECK_ feature in order to investigate a
> > failure during MPI_Finalize.  Setting it to 1 should allow the program
> > to proceed, but won't provide much new information regarding the
> > finalize failure.
> >
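> > For concreteness, a minimal standalone illustration of the MALLOC_CHECK_
> > levels in question (assuming the usual glibc/ptmalloc2 convention, where
> > 1 prints a diagnostic and continues and 2 calls abort()):
> >
> >     #include <stdio.h>
> >     #include <stdlib.h>
> >
> >     int main(void)
> >     {
> >         char *p = malloc(16);
> >         free(p);
> >         free(p);   /* deliberate double free to trigger the checker */
> >         printf("still running after the bad free\n");
> >         return 0;
> >     }
> >
> > Running it as "MALLOC_CHECK_=1 ./a.out" gives the warn-and-continue
> > behavior; "MALLOC_CHECK_=2 ./a.out" aborts at the second free.
> >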
> > I guess I'm stuck in the middle here.  The application I am debugging is
> > pure MPI, so having a less efficient multi-threaded ptmalloc isn't a
> > problem.  Can you estimate how much work it would be for me to swap out
> > ptmalloc2 for ptmalloc3 in my mvapich2 sandbox?
> >
> 
> I would estimate that switching out ptmalloc versions is non-trivial.
> That is because we have a *very careful* integration of MVAPICH2's
> registration cache logic with ptmalloc. We need to know when the
> application calls free and whether the freed memory is still cached
> by MPI. Further, we have to protect against recursion in free when
> de-registering buffers from the InfiniBand driver (since that may call
> free again).
> 
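> To make that concrete, here is a minimal sketch of the idea (not the
> actual MVAPICH2 sources; reg_cache_lookup() and reg_cache_deregister()
> are hypothetical stand-ins for the registration cache):
>
>     #include <stdlib.h>
>
>     extern void __libc_free(void *ptr);      /* the underlying allocator */
>
>     static __thread int in_dereg = 0;        /* recursion guard */
>
>     /* Stubs standing in for the real registration cache logic. */
>     static int  reg_cache_lookup(void *ptr)     { (void) ptr; return 0; }
>     static void reg_cache_deregister(void *ptr) { (void) ptr; }
>
>     /* Intercept free(): if the buffer is still registered, deregister
>      * it first, guarding against the recursive free() that the
>      * deregistration path may trigger. */
>     void free(void *ptr)
>     {
>         if (ptr != NULL && !in_dereg && reg_cache_lookup(ptr)) {
>             in_dereg = 1;
>             reg_cache_deregister(ptr);       /* may call free() again */
>             in_dereg = 0;
>         }
>         __libc_free(ptr);
>     }
>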
> We can directly help you debug the finalization problem instead of
> going this route. We don't want you to be stuck in the middle :-)
> 
> > p.p.s.
> > While we are on the topic of memory allocators, has your group looked
> > into using HOARD? http://www.hoard.org/
> >
> 
> That looks pretty cool. There is an effort in the Linux RDMA community
> to provide kernel support for safe RDMA use. If that is successful,
> then we probably can leave the choice of memory allocator to users.
> 
> http://www.mail-archive.com/linux-rdma@vger.kernel.org/msg03554.html
> 
> > p.p.p.s.
> > The finalize failure has the following trace
> >
> > forrtl: severe (174): SIGSEGV, segmentation fault occurred
> > Image              PC                Routine            Line        Source
> > libibverbs.so.1    00002B527AA70F2E  Unknown               Unknown  Unknown
> > GEOSgcm.x          000000000A9AB4F2  MPIDI_CH3I_CM_Fin        1501  rdma_iba_init.c
> 
> This is interesting, and probably the place we should concentrate on.
> Could you please send us the line number at which the segv occurs?
> e.g.
> 
> (gdb) f 0
> (gdb) list
> 
> Thanks.
> 
> > GEOSgcm.x          000000000A9A1412  MPIDI_CH3_Finaliz          57  ch3_finalize.c
> > GEOSgcm.x          000000000A962CA9  MPID_Finalize             170  mpid_finalize.c
> > GEOSgcm.x          000000000A8A68BC  PMPI_Finalize             168  finalize.c
> > GEOSgcm.x          000000000A806EAF  MPI_Finalize             1534  TauMpi.c
> > GEOSgcm.x          0000000008C2BBFA  _ZN5ESMCI3VMK8fin         470  ESMCI_VMKernel.C
> > GEOSgcm.x          0000000008C3E07D  _ZN5ESMCI2VM8fina        1459  ESMCI_VM.C
> > GEOSgcm.x          0000000009D39246  c_esmc_vmfinalize         826  ESMCI_VM_F.C
> > GEOSgcm.x          0000000009AF7940  esmf_vmmod_mp_esm        6152  ESMF_VM.F90
> > GEOSgcm.x          00000000095C66A0  esmf_initmod_mp_e         513  ESMF_Init.F90
> > GEOSgcm.x          0000000008694D35  mapl_capmod_mp_ma         618  MAPL_Cap.pp.inst.F90
> > GEOSgcm.x          00000000004A4B75  MAIN__                    171  GEOSgcm.pp.inst.F90
> >
> >
> > On Wed, 2010-06-02 at 19:07 -0500, Sayantan Sur wrote:
> >> Hi Dan,
> >>
> >> I looked into this issue. It seems that this is a bug in ptmalloc2.
> >> MVAPICH/MVAPICH2 uses the ptmalloc2 implementation of malloc to provide
> >> safe registration/de-registration. The malloc checking for memory
> >> allocated through 'valloc' appears to be buggy.
> >>
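> >> If it helps, a standalone reproducer along those lines (this is our
> >> assumption, not a confirmed ptmalloc2 test case) is simply valloc()
> >> followed by free() under MALLOC_CHECK_=2:
> >>
> >>     #include <stdlib.h>
> >>     #include <malloc.h>
> >>
> >>     int main(void)
> >>     {
> >>         void *p = valloc(4096);  /* page-aligned, as in mem_hooks.c */
> >>         free(p);                 /* the checking hook reportedly aborts here */
> >>         return 0;
> >>     }
> >>
> >> Linked against the ptmalloc2 bundled with MVAPICH2 and run with
> >> MALLOC_CHECK_=2, this should go through the same free_check() path that
> >> shows up in your backtrace.
> >>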
> >> This bug seems to have been solved in ptmalloc3. The last time we
> >> looked into upgrading to ptmalloc3, we saw this message on ptmalloc's
> >> website. "In multi-thread Applications, ptmalloc2 is currently
> >> slightly more memory-efficient than ptmalloc3."
> >> [http://www.malloc.de/en/] We decided not to upgrade to ptmalloc3.
> >>
> >> If you use MALLOC_CHECK_=1, then you will get a warning, but your
> >> program will proceed. Presumably, you chose to use this checking to
> >> find bugs in your MPI program? Maybe you can overlook this one warning
> >> for now and let us know how it works. We will also investigate
> >> ptmalloc3 and plan to incorporate it in a future release.
> >>
> >> Thanks.
> >>
> >> On Wed, Jun 2, 2010 at 3:47 PM, Sayantan Sur <surs at cse.ohio-state.edu> wrote:
> >> > Hi Dan,
> >> >
> >> > Thanks for reporting this. I don't think anyone has reported this
> >> > earlier. I was able to reproduce it on our systems, and am currently
> >> > looking into this issue.
> >> >
> >> > Thanks.
> >> >
> >> > On Tue, Jun 1, 2010 at 6:25 PM, Dan Kokron <daniel.kokron at nasa.gov> wrote:
> >> >> I am attempting to debug an application that fails during MPI_Finalize.
> >> >> After trying the usual debugging options (-g etc), I set MALLOC_CHECK_=2
> >> >> to see what would happen.  It now fails with the following trace during
> >> >> MPI_Init.  I didn't see any mention of this issue in the archives.
> >> >> Maybe I missed it.
> >> >>
> >> >> #0  0x00000000052e5bb5 in raise () from /lib64/libc.so.6
> >> >> #1  0x00000000052e6fb0 in abort () from /lib64/libc.so.6
> >> >> #2  0x00000000005718f9 in for__signal_handler ()
> >> >> #3  <signal handler called>
> >> >> #4  0x00000000052e5bb5 in raise () from /lib64/libc.so.6
> >> >> #5  0x00000000052e6fb0 in abort () from /lib64/libc.so.6
> >> >> #6  0x0000000000412126 in free_check (mem=0x4138000, caller=0x0) at hooks.c:274
> >> >> #7  0x000000000041480a in free (mem=0x4138000) at mvapich_malloc.c:3443
> >> >> #8  0x00000000004180ce in mvapich2_minit () at mem_hooks.c:86
> >> >> #9  0x00000000005526a8 in MPIDI_CH3I_RDMA_init (pg=0x411f618, pg_rank=21) at rdma_iba_init.c:153
> >> >> #10 0x000000000054d148 in MPIDI_CH3_Init (has_parent=0, pg=0x411f618, pg_rank=21) at ch3_init.c:161
> >> >> #11 0x00000000004d9cce in MPID_Init (argc=0x0, argv=0x0, requested=0, provided=0x7feffba78, has_args=0x7feffba80, has_env=0x7feffba7c) at mpid_init.c:189
> >> >> #12 0x0000000000435780 in MPIR_Init_thread (argc=0x0, argv=0x0, required=0, provided=0x0) at initthread.c:305
> >> >> #13 0x0000000000434582 in PMPI_Init (argc=0x0, argv=0x0) at init.c:135
> >> >> #14 0x0000000000410e0f in pmpi_init_ (ierr=0x7feffe774) at initf.c:129
> >> >> #15 0x000000000040bdbf in gcrm_test_io () at gcrm_test_io.f90:27
> >> >> #16 0x000000000040bcdc in main ()
> >> >>
> >> >> Valgrind-3.5.0 gives the following
> >> >>
> >> >> ==21574== Conditional jump or move depends on uninitialised value(s)
> >> >> ==21574==    at 0x41182C: mem2chunk_check (hooks.c:165)
> >> >> ==21574==    by 0x4120C3: free_check (hooks.c:268)
> >> >> ==21574==    by 0x414809: free (mvapich_malloc.c:3443)
> >> >> ==21574==    by 0x4180CD: mvapich2_minit (mem_hooks.c:86)
> >> >> ==21574==    by 0x5526A7: MPIDI_CH3I_RDMA_init (rdma_iba_init.c:153)
> >> >> ==21574==    by 0x54D147: MPIDI_CH3_Init (ch3_init.c:161)
> >> >> ==21574==    by 0x4D9CCD: MPID_Init (mpid_init.c:189)
> >> >> ==21574==    by 0x43577F: MPIR_Init_thread (initthread.c:305)
> >> >> ==21574==    by 0x434581: PMPI_Init (init.c:135)
> >> >> ==21574==    by 0x410E0E: mpi_init_ (initf.c:129)
> >> >> ==21574==    by 0x40BDBE: MAIN__ (gcrm_test_io.f90:27)
> >> >> ==21574==    by 0x40BCDB: main (in /gpfsm/dhome/dkokron/play/mpi-io/gcrm_test_io.x)
> >> >> ==21574==  Uninitialised value was created
> >> >> ==21574==    at 0x536FC7A: brk (in /lib64/libc-2.4.so)
> >> >> ==21574==    by 0x536FD41: sbrk (in /lib64/libc-2.4.so)
> >> >> ==21574==    by 0x418251: mvapich2_sbrk (mem_hooks.c:148)
> >> >> ==21574==    by 0x414058: sYSMALLOc (mvapich_malloc.c:2983)
> >> >> ==21574==    by 0x41647E: _int_malloc (mvapich_malloc.c:4318)
> >> >> ==21574==    by 0x411FE8: malloc_check (hooks.c:252)
> >> >> ==21574==    by 0x414607: malloc (mvapich_malloc.c:3395)
> >> >> ==21574==    by 0x4113AA: malloc_hook_ini (hooks.c:28)
> >> >> ==21574==    by 0x414607: malloc (mvapich_malloc.c:3395)
> >> >> ==21574==    by 0x57E382: for__get_vm (in /gpfsm/dhome/dkokron/play/mpi-io/gcrm_test_io.x)
> >> >> ==21574==    by 0x5722B2: for_rtl_init_ (in /gpfsm/dhome/dkokron/play/mpi-io/gcrm_test_io.x)
> >> >> ==21574==    by 0x40BCD6: main (in /gpfsm/dhome/dkokron/play/mpi-io/gcrm_test_io.x)
> >> >>
> >> >> I am using mvapich2-1.4-2010-05-25 configured as follows
> >> >>
> >> >> ./configure CC=icc CXX=icpc F77=ifort F90=ifort CFLAGS="-DRDMA_CM -fpic
> >> >> -O0 -traceback -debug" CXXFLAGS="-DRDMA_CM -fpic -O0 -traceback -debug"
> >> >> FFLAGS="-fpic -O0 -traceback -debug -nolib-inline -check bounds -check
> >> >> uninit -fp-stack-check -ftrapuv" F90FLAGS="-fpic -O0 -traceback -debug
> >> >> -nolib-inline -check bounds -check uninit -fp-stack-check -ftrapuv"
> >> >> --prefix=/discover/nobackup/dkokron/mv2-1.4.1_debug
> >> >> --enable-error-checking=all --enable-error-messages=all --enable-g=all
> >> >> --enable-f77 --enable-f90 --enable-cxx --enable-mpe --enable-romio
> >> >> --enable-threads=multiple --with-rdma=gen2
> >> >>
> >> >> on Linux
> >> >> 2.6.16.60-0.42.5-smp
> >> >>
> >> >> and Intel compilers (v 11.0.083)
> >> >>
> >> >> Note that line 86 in my mem_hooks.c is the following (I added some
> >> >> debug prints):
> >> >>
> >> >>    free(ptr_calloc);
> >> >> --->free(ptr_valloc);  <---
> >> >>    free(ptr_memalign);
> >> >>
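> >> >> In outline, that block just exercises the allocator hooks, roughly like
> >> >> this (a paraphrase, not a verbatim copy of mem_hooks.c):
> >> >>
> >> >>     #include <stdlib.h>
> >> >>     #include <malloc.h>
> >> >>
> >> >>     static void exercise_free_hooks(void)
> >> >>     {
> >> >>         void *ptr_calloc   = calloc(1, 64);
> >> >>         void *ptr_valloc   = valloc(64);        /* page-aligned */
> >> >>         void *ptr_memalign = memalign(64, 64);  /* 64-byte aligned */
> >> >>
> >> >>         free(ptr_calloc);
> >> >>         free(ptr_valloc);    /* the free that aborts under MALLOC_CHECK_=2 */
> >> >>         free(ptr_memalign);
> >> >>     }
> >> >>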
> >> >> --
> >> >> Dan Kokron
> >> >> Global Modeling and Assimilation Office
> >> >> NASA Goddard Space Flight Center
> >> >> Greenbelt, MD 20771
> >> >> Daniel.S.Kokron at nasa.gov
> >> >> Phone: (301) 614-5192
> >> >> Fax:   (301) 614-5304
> >> >>
> >> >> _______________________________________________
> >> >> mvapich-discuss mailing list
> >> >> mvapich-discuss at cse.ohio-state.edu
> >> >> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> >> >>
> >> >>
> >> >
> >> >
> >> >
> >> > --
> >> > Sayantan Sur
> >> >
> >> > Research Scientist
> >> > Department of Computer Science
> >> > The Ohio State University.
> >> >
> >>
> >>
> >>
> > --
> > Dan Kokron
> > Global Modeling and Assimilation Office
> > NASA Goddard Space Flight Center
> > Greenbelt, MD 20771
> > Daniel.S.Kokron at nasa.gov
> > Phone: (301) 614-5192
> > Fax:   (301) 614-5304
> >
> >
> 
> 
> 
-- 
Dan Kokron
Global Modeling and Assimilation Office
NASA Goddard Space Flight Center
Greenbelt, MD 20771
Daniel.S.Kokron at nasa.gov
Phone: (301) 614-5192
Fax:   (301) 614-5304


