[mvapich-discuss] mvapich2 + MALLOC_CHECK_

Sayantan Sur surs at cse.ohio-state.edu
Thu Jun 3 14:14:34 EDT 2010


Hi Dan,

On Thu, Jun 3, 2010 at 1:40 PM, Dan Kokron <daniel.kokron at nasa.gov> wrote:
> I originally enabled the MALLOC_CHECK_ feature in order to investigate a
> failure during MPI_Finalize.  Setting it to 1 should allow the program
> to proceed, but won't provide much new information regarding the
> finalize failure.
>
> I guess I'm stuck in the middle here.  The application I am debugging is
> pure MPI, so having a less efficient multi-threaded ptmalloc isn't a
> problem.  Can you estimate how much work it would be for me to swap out
> ptmalloc2 for ptmalloc3 in my mvapich2 sandbox?
>

I would estimate that switching out ptmalloc versions is non-trivial,
because the MVAPICH2 registration cache logic is *very carefully*
integrated with ptmalloc. We need to know when the application calls
free and whether the freed memory is still cached in MPI's
registration cache. Further, we have to protect against recursion in
free when de-registering buffers from the InfiniBand driver (since
de-registration may call free again).
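
For illustration, here is a minimal sketch of that interception (the
helper names are hypothetical, not the actual MVAPICH2 API; the real
logic lives in mvapich_malloc.c and mem_hooks.c):

    #include <stdlib.h>

    static int in_dereg = 0;  /* guard against re-entering free() */

    /* hypothetical registration-cache helpers */
    extern int  dreg_find(void *ptr);        /* is ptr registered?   */
    extern void dreg_unregister(void *ptr);  /* de-register with the
                                                IB driver; may call
                                                free() itself        */
    extern void ptmalloc_free(void *ptr);    /* underlying allocator */

    void free(void *ptr)
    {
        if (!in_dereg && dreg_find(ptr)) {
            in_dereg = 1;              /* block recursion while the */
            dreg_unregister(ptr);      /* IB driver de-registers    */
            in_dereg = 0;
        }
        ptmalloc_free(ptr);            /* finally release the memory */
    }

Any allocator we switched to would have to be wired through this same
guard, which is why the swap is more than a drop-in replacement.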

We can directly help you debug the finalization problem instead of
going this route. We don't want you to be stuck in the middle :-)

> p.p.s.
> While we are on the topic of memory allocators, has your group looked
> into using HOARD? http://www.hoard.org/
>

That looks pretty cool. There is an effort in the Linux RDMA community
to provide kernel support for safe RDMA use. If that effort succeeds,
we can probably leave the choice of memory allocator to the user.

http://www.mail-archive.com/linux-rdma@vger.kernel.org/msg03554.html

> p.p.p.s.
> The finalize failure has the following trace
>
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
> Image              PC                Routine            Line        Source
> libibverbs.so.1    00002B527AA70F2E  Unknown               Unknown  Unknown
> GEOSgcm.x          000000000A9AB4F2  MPIDI_CH3I_CM_Fin        1501  rdma_iba_init.c

This is interesting, and probably where we should concentrate. Could
you please send us the source lines around where the SEGV occurs?
e.g.

(gdb) f 0
(gdb) list

Thanks.

> GEOSgcm.x          000000000A9A1412  MPIDI_CH3_Finaliz          57  ch3_finalize.c
> GEOSgcm.x          000000000A962CA9  MPID_Finalize             170  mpid_finalize.c
> GEOSgcm.x          000000000A8A68BC  PMPI_Finalize             168  finalize.c
> GEOSgcm.x          000000000A806EAF  MPI_Finalize             1534  TauMpi.c
> GEOSgcm.x          0000000008C2BBFA  _ZN5ESMCI3VMK8fin         470  ESMCI_VMKernel.C
> GEOSgcm.x          0000000008C3E07D  _ZN5ESMCI2VM8fina        1459  ESMCI_VM.C
> GEOSgcm.x          0000000009D39246  c_esmc_vmfinalize         826  ESMCI_VM_F.C
> GEOSgcm.x          0000000009AF7940  esmf_vmmod_mp_esm        6152  ESMF_VM.F90
> GEOSgcm.x          00000000095C66A0  esmf_initmod_mp_e         513  ESMF_Init.F90
> GEOSgcm.x          0000000008694D35  mapl_capmod_mp_ma         618  MAPL_Cap.pp.inst.F90
> GEOSgcm.x          00000000004A4B75  MAIN__                    171  GEOSgcm.pp.inst.F90
>
>
> On Wed, 2010-06-02 at 19:07 -0500, Sayantan Sur wrote:
>> Hi Dan,
>>
>> I looked into this issue. It seems to be a bug in ptmalloc2.
>> MVAPICH/MVAPICH2 uses the ptmalloc2 implementation of malloc to
>> provide safe registration/de-registration. The malloc checking for
>> memory allocated through 'valloc' appears to be buggy.
>>
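>> For reference, a minimal reproducer of the pattern we suspect (an
>> assumed test case, not taken from your application) is just a
>> valloc/free pair, linked against MVAPICH2 so that its ptmalloc2
>> provides malloc/free:
>>
>>     #include <stdlib.h>
>>
>>     int main(void)
>>     {
>>         void *p = valloc(4096);  /* page-aligned allocation */
>>         free(p);                 /* legitimate free; a buggy
>>                                     free_check() flags it when
>>                                     MALLOC_CHECK_ is set */
>>         return 0;
>>     }
>>
>> With MALLOC_CHECK_=2 this should abort in free_check() (as in your
>> trace), while MALLOC_CHECK_=1 should only print a diagnostic and
>> continue.
>>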
>> This bug seems to have been solved in ptmalloc3. The last time we
>> looked into upgrading to ptmalloc3, we saw this message on ptmalloc's
>> website. "In multi-thread Applications, ptmalloc2 is currently
>> slightly more memory-efficient than ptmalloc3."
>> [http://www.malloc.de/en/] We decided not to upgrade to ptmalloc3.
>>
>> If you use MALLOC_CHECK_=1, you will get a warning, but your program
>> will proceed. Presumably you chose to use this checking to find bugs
>> in your MPI program? Maybe you can overlook this one warning for now
>> and let us know how it goes. We will also investigate ptmalloc3 and
>> plan to incorporate it in a future release.
>>
>> Thanks.
>>
>> On Wed, Jun 2, 2010 at 3:47 PM, Sayantan Sur <surs at cse.ohio-state.edu> wrote:
>> > Hi Dan,
>> >
>> > Thanks for reporting this. I don't think anyone has reported this
>> > issue earlier. I was able to reproduce it on our systems, and am
>> > currently looking into it.
>> >
>> > Thanks.
>> >
>> > On Tue, Jun 1, 2010 at 6:25 PM, Dan Kokron <daniel.kokron at nasa.gov> wrote:
>> >> I am attempting to debug an application that fails during MPI_Finalize.
>> >> After trying the usual debugging options (-g etc), I set MALLOC_CHECK_=2
>> >> to see what would happen.  It now fails with the following trace during
>> >> MPI_Init.  I didn't see any mention of this issue in the archives.
>> >> Maybe I missed it.
>> >>
>> >> #0  0x00000000052e5bb5 in raise () from /lib64/libc.so.6
>> >> #1  0x00000000052e6fb0 in abort () from /lib64/libc.so.6
>> >> #2  0x00000000005718f9 in for__signal_handler ()
>> >> #3  <signal handler called>
>> >> #4  0x00000000052e5bb5 in raise () from /lib64/libc.so.6
>> >> #5  0x00000000052e6fb0 in abort () from /lib64/libc.so.6
>> >> #6  0x0000000000412126 in free_check (mem=0x4138000, caller=0x0) at hooks.c:274
>> >> #7  0x000000000041480a in free (mem=0x4138000) at mvapich_malloc.c:3443
>> >> #8  0x00000000004180ce in mvapich2_minit () at mem_hooks.c:86
>> >> #9  0x00000000005526a8 in MPIDI_CH3I_RDMA_init (pg=0x411f618, pg_rank=21) at rdma_iba_init.c:153
>> >> #10 0x000000000054d148 in MPIDI_CH3_Init (has_parent=0, pg=0x411f618, pg_rank=21) at ch3_init.c:161
>> >> #11 0x00000000004d9cce in MPID_Init (argc=0x0, argv=0x0, requested=0, provided=0x7feffba78, has_args=0x7feffba80, has_env=0x7feffba7c) at mpid_init.c:189
>> >> #12 0x0000000000435780 in MPIR_Init_thread (argc=0x0, argv=0x0, required=0, provided=0x0) at initthread.c:305
>> >> #13 0x0000000000434582 in PMPI_Init (argc=0x0, argv=0x0) at init.c:135
>> >> #14 0x0000000000410e0f in pmpi_init_ (ierr=0x7feffe774) at initf.c:129
>> >> #15 0x000000000040bdbf in gcrm_test_io () at gcrm_test_io.f90:27
>> >> #16 0x000000000040bcdc in main ()
>> >>
>> >> Valgrind-3.5.0 gives the following
>> >>
>> >> ==21574== Conditional jump or move depends on uninitialised value(s)
>> >> ==21574==    at 0x41182C: mem2chunk_check (hooks.c:165)
>> >> ==21574==    by 0x4120C3: free_check (hooks.c:268)
>> >> ==21574==    by 0x414809: free (mvapich_malloc.c:3443)
>> >> ==21574==    by 0x4180CD: mvapich2_minit (mem_hooks.c:86)
>> >> ==21574==    by 0x5526A7: MPIDI_CH3I_RDMA_init (rdma_iba_init.c:153)
>> >> ==21574==    by 0x54D147: MPIDI_CH3_Init (ch3_init.c:161)
>> >> ==21574==    by 0x4D9CCD: MPID_Init (mpid_init.c:189)
>> >> ==21574==    by 0x43577F: MPIR_Init_thread (initthread.c:305)
>> >> ==21574==    by 0x434581: PMPI_Init (init.c:135)
>> >> ==21574==    by 0x410E0E: mpi_init_ (initf.c:129)
>> >> ==21574==    by 0x40BDBE: MAIN__ (gcrm_test_io.f90:27)
>> >> ==21574==    by 0x40BCDB: main (in /gpfsm/dhome/dkokron/play/mpi-io/gcrm_test_io.x)
>> >> ==21574==  Uninitialised value was created
>> >> ==21574==    at 0x536FC7A: brk (in /lib64/libc-2.4.so)
>> >> ==21574==    by 0x536FD41: sbrk (in /lib64/libc-2.4.so)
>> >> ==21574==    by 0x418251: mvapich2_sbrk (mem_hooks.c:148)
>> >> ==21574==    by 0x414058: sYSMALLOc (mvapich_malloc.c:2983)
>> >> ==21574==    by 0x41647E: _int_malloc (mvapich_malloc.c:4318)
>> >> ==21574==    by 0x411FE8: malloc_check (hooks.c:252)
>> >> ==21574==    by 0x414607: malloc (mvapich_malloc.c:3395)
>> >> ==21574==    by 0x4113AA: malloc_hook_ini (hooks.c:28)
>> >> ==21574==    by 0x414607: malloc (mvapich_malloc.c:3395)
>> >> ==21574==    by 0x57E382: for__get_vm (in /gpfsm/dhome/dkokron/play/mpi-io/gcrm_test_io.x)
>> >> ==21574==    by 0x5722B2: for_rtl_init_ (in /gpfsm/dhome/dkokron/play/mpi-io/gcrm_test_io.x)
>> >> ==21574==    by 0x40BCD6: main (in /gpfsm/dhome/dkokron/play/mpi-io/gcrm_test_io.x)
>> >>
>> >> I am using mvapich2-1.4-2010-05-25 configured as follows
>> >>
>> >> ./configure CC=icc CXX=icpc F77=ifort F90=ifort \
>> >>   CFLAGS="-DRDMA_CM -fpic -O0 -traceback -debug" \
>> >>   CXXFLAGS="-DRDMA_CM -fpic -O0 -traceback -debug" \
>> >>   FFLAGS="-fpic -O0 -traceback -debug -nolib-inline -check bounds \
>> >>           -check uninit -fp-stack-check -ftrapuv" \
>> >>   F90FLAGS="-fpic -O0 -traceback -debug -nolib-inline -check bounds \
>> >>             -check uninit -fp-stack-check -ftrapuv" \
>> >>   --prefix=/discover/nobackup/dkokron/mv2-1.4.1_debug \
>> >>   --enable-error-checking=all --enable-error-messages=all \
>> >>   --enable-g=all --enable-f77 --enable-f90 --enable-cxx \
>> >>   --enable-mpe --enable-romio --enable-threads=multiple \
>> >>   --with-rdma=gen2
>> >>
>> >> on Linux
>> >> 2.6.16.60-0.42.5-smp
>> >>
>> >> and Intel compilers (v 11.0.083)
>> >>
>> >> Note that line 86 in my mem_hooks.c is the following (I added some
>> >> debug prints):
>> >>
>> >>    free(ptr_calloc);
>> >> --->free(ptr_valloc);  <---
>> >>    free(ptr_memalign);
>> >>
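>> >> For context, a plausible reconstruction of the surrounding code in
>> >> mvapich2_minit() (assumed from the traces; the actual file may
>> >> differ) is a sequence that exercises each allocation entry point
>> >> once so the malloc hooks get initialized:
>> >>
>> >>    void *ptr_calloc   = calloc(1, 64);
>> >>    void *ptr_valloc   = valloc(64);        /* page-aligned */
>> >>    void *ptr_memalign = memalign(64, 64);
>> >>
>> >>    free(ptr_calloc);
>> >>    free(ptr_valloc);    /* line 86: the free that aborts under
>> >>                            MALLOC_CHECK_=2 (free_check, hooks.c:274) */
>> >>    free(ptr_memalign);
>> >>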
>> >> --
>> >> Dan Kokron
>> >> Global Modeling and Assimilation Office
>> >> NASA Goddard Space Flight Center
>> >> Greenbelt, MD 20771
>> >> Daniel.S.Kokron at nasa.gov
>> >> Phone: (301) 614-5192
>> >> Fax:   (301) 614-5304
>> >>
>> >
>> >
>> >
>> > --
>> > Sayantan Sur
>> >
>> > Research Scientist
>> > Department of Computer Science
>> > The Ohio State University.
>> >
>>
>>
>>
> --
> Dan Kokron
> Global Modeling and Assimilation Office
> NASA Goddard Space Flight Center
> Greenbelt, MD 20771
> Daniel.S.Kokron at nasa.gov
> Phone: (301) 614-5192
> Fax:   (301) 614-5304
>
>



-- 
Sayantan Sur

Research Scientist
Department of Computer Science
The Ohio State University.


