[mvapich-discuss] mvapich2 + MALLOC_CHECK_

Dan Kokron daniel.kokron at nasa.gov
Tue Jun 1 18:25:48 EDT 2010


I am trying to debug an application that fails during MPI_Finalize.
After trying the usual debugging options (-g, etc.), I set MALLOC_CHECK_=2
to see what would happen.  The application now aborts during MPI_Init
with the following trace.  I didn't see any mention of this issue in the
archives, though I may have missed it.

#0  0x00000000052e5bb5 in raise () from /lib64/libc.so.6
#1  0x00000000052e6fb0 in abort () from /lib64/libc.so.6
#2  0x00000000005718f9 in for__signal_handler ()
#3  <signal handler called>
#4  0x00000000052e5bb5 in raise () from /lib64/libc.so.6
#5  0x00000000052e6fb0 in abort () from /lib64/libc.so.6
#6  0x0000000000412126 in free_check (mem=0x4138000, caller=0x0) at hooks.c:274
#7  0x000000000041480a in free (mem=0x4138000) at mvapich_malloc.c:3443
#8  0x00000000004180ce in mvapich2_minit () at mem_hooks.c:86
#9  0x00000000005526a8 in MPIDI_CH3I_RDMA_init (pg=0x411f618, pg_rank=21) at rdma_iba_init.c:153
#10 0x000000000054d148 in MPIDI_CH3_Init (has_parent=0, pg=0x411f618, pg_rank=21) at ch3_init.c:161
#11 0x00000000004d9cce in MPID_Init (argc=0x0, argv=0x0, requested=0, provided=0x7feffba78, has_args=0x7feffba80, has_env=0x7feffba7c) at mpid_init.c:189
#12 0x0000000000435780 in MPIR_Init_thread (argc=0x0, argv=0x0, required=0, provided=0x0) at initthread.c:305
#13 0x0000000000434582 in PMPI_Init (argc=0x0, argv=0x0) at init.c:135
#14 0x0000000000410e0f in pmpi_init_ (ierr=0x7feffe774) at initf.c:129
#15 0x000000000040bdbf in gcrm_test_io () at gcrm_test_io.f90:27
#16 0x000000000040bcdc in main ()

Valgrind-3.5.0 reports the following:

==21574== Conditional jump or move depends on uninitialised value(s)
==21574==    at 0x41182C: mem2chunk_check (hooks.c:165)
==21574==    by 0x4120C3: free_check (hooks.c:268)
==21574==    by 0x414809: free (mvapich_malloc.c:3443)
==21574==    by 0x4180CD: mvapich2_minit (mem_hooks.c:86)
==21574==    by 0x5526A7: MPIDI_CH3I_RDMA_init (rdma_iba_init.c:153)
==21574==    by 0x54D147: MPIDI_CH3_Init (ch3_init.c:161)
==21574==    by 0x4D9CCD: MPID_Init (mpid_init.c:189)
==21574==    by 0x43577F: MPIR_Init_thread (initthread.c:305)
==21574==    by 0x434581: PMPI_Init (init.c:135)
==21574==    by 0x410E0E: mpi_init_ (initf.c:129)
==21574==    by 0x40BDBE: MAIN__ (gcrm_test_io.f90:27)
==21574==    by 0x40BCDB: main (in /gpfsm/dhome/dkokron/play/mpi-io/gcrm_test_io.x)
==21574==  Uninitialised value was created
==21574==    at 0x536FC7A: brk (in /lib64/libc-2.4.so)
==21574==    by 0x536FD41: sbrk (in /lib64/libc-2.4.so)
==21574==    by 0x418251: mvapich2_sbrk (mem_hooks.c:148)
==21574==    by 0x414058: sYSMALLOc (mvapich_malloc.c:2983)
==21574==    by 0x41647E: _int_malloc (mvapich_malloc.c:4318)
==21574==    by 0x411FE8: malloc_check (hooks.c:252)
==21574==    by 0x414607: malloc (mvapich_malloc.c:3395)
==21574==    by 0x4113AA: malloc_hook_ini (hooks.c:28)
==21574==    by 0x414607: malloc (mvapich_malloc.c:3395)
==21574==    by 0x57E382: for__get_vm (in /gpfsm/dhome/dkokron/play/mpi-io/gcrm_test_io.x)
==21574==    by 0x5722B2: for_rtl_init_ (in /gpfsm/dhome/dkokron/play/mpi-io/gcrm_test_io.x)
==21574==    by 0x40BCD6: main (in /gpfsm/dhome/dkokron/play/mpi-io/gcrm_test_io.x)

I am using mvapich2-1.4-2010-05-25, configured as follows:

./configure CC=icc CXX=icpc F77=ifort F90=ifort CFLAGS="-DRDMA_CM -fpic
-O0 -traceback -debug" CXXFLAGS="-DRDMA_CM -fpic -O0 -traceback -debug"
FFLAGS="-fpic -O0 -traceback -debug -nolib-inline -check bounds -check
uninit -fp-stack-check -ftrapuv" F90FLAGS="-fpic -O0 -traceback -debug
-nolib-inline -check bounds -check uninit -fp-stack-check -ftrapuv"
--prefix=/discover/nobackup/dkokron/mv2-1.4.1_debug
--enable-error-checking=all --enable-error-messages=all --enable-g=all
--enable-f77 --enable-f90 --enable-cxx --enable-mpe --enable-romio
--enable-threads=multiple --with-rdma=gen2

The system runs Linux 2.6.16.60-0.42.5-smp with the Intel compilers
(v 11.0.083).

Note that line 86 of my mem_hooks.c is the free(ptr_valloc) call marked
below (I added some debug prints, so my line numbers differ from the
stock source):

    free(ptr_calloc);
--->free(ptr_valloc);  <---
    free(ptr_memalign);
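
For comparison, the same free sequence can be exercised outside MPI.  The
following is a minimal sketch, not MVAPICH2 code: alloc_free_roundtrip is a
hypothetical helper, and it assumes each pointer comes from the like-named
allocator (calloc, valloc, memalign), which the excerpt above suggests but
does not show.  The 64-byte memalign alignment is also an assumption.

```c
#define _DEFAULT_SOURCE   /* exposes valloc() in glibc headers */
#include <malloc.h>       /* memalign() */
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>       /* sysconf() */

/* Hypothetical helper: allocate with calloc, valloc and memalign, check the
 * alignment of the two aligned blocks, then free all three in the order
 * shown in the mem_hooks.c excerpt.
 * Returns 0 on success, -1 if an allocation failed or was misaligned. */
int alloc_free_roundtrip(size_t size)
{
    long page = sysconf(_SC_PAGESIZE);

    void *ptr_calloc   = calloc(size, 1);
    void *ptr_valloc   = valloc(size);        /* page-aligned block */
    void *ptr_memalign = memalign(64, size);  /* assumed 64-byte alignment */

    int ok = ptr_calloc && ptr_valloc && ptr_memalign && page > 0
          && (uintptr_t)ptr_valloc % (uintptr_t)page == 0
          && (uintptr_t)ptr_memalign % 64 == 0;

    /* free(NULL) is a no-op, so this is safe even if an allocation failed. */
    free(ptr_calloc);
    free(ptr_valloc);
    free(ptr_memalign);

    return ok ? 0 : -1;
}
```

Calling alloc_free_roundtrip(64) from a small main and running the binary
with MALLOC_CHECK_=2 set, without linking MVAPICH2, exercises the system
allocator only.  If that run is clean, the abort above would more likely
come from the allocator bundled with MVAPICH2 (mvapich_malloc.c and hooks.c
in the trace) than from glibc, though this is only a rough check.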

-- 
Dan Kokron
Global Modeling and Assimilation Office
NASA Goddard Space Flight Center
Greenbelt, MD 20771
Daniel.S.Kokron at nasa.gov
Phone: (301) 614-5192
Fax:   (301) 614-5304


