[mvapich-discuss] leaked context IDs detected

Hari Subramoni subramoni.1 at osu.edu
Thu Sep 3 10:23:58 EDT 2015


Hi Amit,

This looks like an application issue. From the debug messages you've
posted, it appears the application is creating extra communicators
(other than MPI_COMM_WORLD) through MPI_Comm_create, MPI_Comm_split,
etc., without freeing them (using MPI_Comm_free) before calling
MPI_Finalize. Can you please go through your code and check for any such
instances?
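
For illustration, here is a minimal sketch of the pattern that triggers
this warning (a hypothetical example, not taken from your application):

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Create a derived communicator: split ranks into two groups. */
    MPI_Comm half;
    MPI_Comm_split(MPI_COMM_WORLD, rank % 2, rank, &half);

    /* ... work using 'half' ... */

    /* Without this call, a debug build (--enable-g=yes) reports
       "leaked context IDs detected" and leftover COMM handles at
       MPI_Finalize. */
    MPI_Comm_free(&half);

    MPI_Finalize();
    return 0;
}
```

Every communicator obtained from MPI_Comm_create, MPI_Comm_split,
MPI_Comm_dup, etc. should be released with MPI_Comm_free before
MPI_Finalize; the debug build tracks context IDs and reports any that
are still allocated at shutdown.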

Regards,
Hari.

On Thu, Sep 3, 2015 at 10:02 AM, Kumar, Amit <ahkumar at mail.smu.edu> wrote:

> Hi Hari,
>
>
>
> Thank you for your response. I tried it with the latest release (2.2a) of
> MVAPICH2, but I am seeing similar errors. Here are the additional details
> you requested.
>
> Using the GNU GCC 5.1.0 compiler
>
>
>
> #mpiname -a
>
> MVAPICH2 2.2a Mon Aug 17 20:00:00 EDT 2015 ch3:mrail
>
>
>
> Compilation
>
> CC: gcc    -DNDEBUG -DNVALGRIND -g -O2
>
> CXX: g++   -DNDEBUG -DNVALGRIND -g -O2
>
> F77: gfortran   -g -O2
>
> FC: gfortran   -g -O2
>
>
>
> Configuration
>
> --prefix=/grid/software/mvapich2/2.2a/gcc-5.1.0 --enable-fortran=all
> --enable-cxx --enable-romio --enable-mpe --with-pm=slurm --with-pmi=pmi2
> --enable-threads=multiple --enable-fast --enable-g=yes --with-hwloc
>
>
>
> Please let me know if you need more information.
>
> Regards,
> Amit
>
>
>
> Sample errors from a simple NAMD run:
>
> ….
>
> All tests completed, exiting
>
> [Partition 0][Node 0] End of program
>
> [0] 32 at [0x00000000018e3888], src/mpid/ch3/src/mpid_vc.c[132]
>
> [1] 56 at [0x00000000028aa8b8], src/mpid/ch3/src/mpid_vc.c[132]
>
> [1] 16 at [0x00000000028a3028], src/util/procmap/local_proc.c[93]
>
> [1] 16 at [0x00000000028a8a88], src/util/procmap/local_proc.c[92]
>
> [1] 504 at [0x00000000028b1f58], src/mpi/comm/commutil.c[337]
>
> [1] 504 at [0x00000000028b1748], src/mpi/comm/commutil.c[337]
>
> [1] 56 at [0x00000000028a6bb8], src/mpid/ch3/src/mpid_vc.c[132]
>
> [2] 56 at [0x0000000001d0d8b8], src/mpid/ch3/src/mpid_vc.c[132]
>
> [2] 16 at [0x0000000001d06028], src/util/procmap/local_proc.c[93]
>
> [2] 16 at [0x0000000001d0ba88], src/util/procmap/local_proc.c[92]
>
> [2] 504 at [0x0000000001d14f58], src/mpi/comm/commutil.c[337]
>
> [2] 504 at [0x0000000001d14748], src/mpi/comm/commutil.c[337]
>
> [2] 56 at [0x0000000001d09bb8], src/mpid/ch3/src/mpid_vc.c[132]
>
> [0] 56 at [0x00000000018a3198], src/mpid/ch3/src/mpid_vc.c[132]
>
> [0] 16 at [0x00000000018960e8], src/util/procmap/local_proc.c[93]
>
> [0] 16 at [0x0000000001896028], src/util/procmap/local_proc.c[92]
>
> [0] 504 at [0x00000000018a4c88], src/mpi/comm/commutil.c[337]
>
> [0] 504 at [0x00000000018a49e8], src/mpi/comm/commutil.c[337]
>
> [0] 504 at [0x00000000018a4748], src/mpi/comm/commutil.c[337]
>
> leaked context IDs detected: mask=0x7f22cc6912a0 mask[0]=0x7fffffff
>
> In direct memory block for handle type COMM, 2 handles are still allocated
>
> leaked context IDs detected: mask=0x7f698f0d92a0 mask[0]=0x7fffffff
>
> In direct memory block for handle type COMM, 2 handles are still allocated
>
> [0] 56 at [0x0000000001898328], src/mpid/ch3/src/mpid_vc.c[132]
>
> leaked context IDs detected: mask=0x7fbf6ca4c2a0 mask[0]=0x7fffffff
>
> In direct memory block for handle type COMM, 3 handles are still allocated
>
> [3] 56 at [0x0000000000b578b8], src/mpid/ch3/src/mpid_vc.c[132]
>
> [3] 16 at [0x0000000000b50028], src/util/procmap/local_proc.c[93]
>
> [3] 16 at [0x0000000000b55a88], src/util/procmap/local_proc.c[92]
>
> [3] 504 at [0x0000000000b5ef58], src/mpi/comm/commutil.c[337]
>
> [3] 504 at [0x0000000000b5e748], src/mpi/comm/commutil.c[337]
>
> [3] 56 at [0x0000000000b53bb8], src/mpid/ch3/src/mpid_vc.c[132]
>
> leaked context IDs detected: mask=0x7f8f401152a0 mask[0]=0x7fffffff
>
> In direct memory block for handle type COMM, 2 handles are still allocated
>
>
>
> From: hari.subramoni at gmail.com [mailto:hari.subramoni at gmail.com] On
> Behalf Of Hari Subramoni
> Subject: Re: [mvapich-discuss] leaked context IDs detected
>
>
>
> Hello Amit,
>
> Can you please try the latest 2.2a release of MVAPICH2? We made some
> memory-leak-related fixes there. If the memory leaks don't go away, could
> you please send us the following?
>
> 1. A reproducer
> 2. The output of mpiname -a
> 3. The compilers and the version of compilers used
>
> Regards,
> Hari.
>

