[mvapich-discuss] Infinite loop in ptmalloc

Adam T. Moody moody20 at llnl.gov
Mon Dec 15 20:53:53 EST 2014


Hello MVAPICH team,
We have a code using MVAPICH2-1.9 that forks a child process, and the 
child dies after it eventually consumes all available memory.  If I 
SIGSTOP the child and attach to it before it dies, its stack trace 
shows that it is apparently stuck in infinite recursion consisting of 
calls to:

malloc_atfork()
malloc() at mvapich_malloc.c:3403

I can see that mvapich_malloc.c:3403 is the last line of the following, 
which invokes the __malloc_hook function pointer:

  __malloc_ptr_t (*hook) __MALLOC_P ((size_t, __const __malloc_ptr_t)) =
    __malloc_hook;
  if (hook != NULL)
    return (*hook)(bytes, RETURN_ADDRESS (0));

From the stack trace, I can deduce that __malloc_hook must be pointing 
to malloc_atfork().

Then looking at the malloc_atfork() implementation, I can see that it 
calls public_mALLOc() in its else clause, which seems like it may be 
the code path leading to the loop:

  } else {
    /* Suspend the thread until the `atfork' handlers have completed.
       By that time, the hooks will have been reset as well, so that
       mALLOc() can be used again. */
    (void)mutex_lock(&list_lock);
    (void)mutex_unlock(&list_lock);
    return public_mALLOc(sz);
  }
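
To make the suspected path concrete, here is a minimal standalone sketch 
of what could happen if the hook were still pointing at the atfork 
handler when the else branch runs.  This is not MVAPICH2 or glibc code: 
every sketch_* name is hypothetical, the lock/unlock pair is only 
indicated by a comment, and a dispatch limit is added purely so the 
demonstration terminates instead of exhausting memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names exist in MVAPICH2 or glibc. */
typedef void *(*sketch_hook_t)(size_t);

static sketch_hook_t sketch_hook = NULL; /* plays the role of __malloc_hook */
static int sketch_dispatches = 0;        /* how many times the hook was invoked */
static int sketch_limit = 0;             /* safety cutoff, demonstration only */

static void *sketch_malloc(size_t sz);

/* Plays the role of the else branch of malloc_atfork(). */
static void *sketch_atfork_hook(size_t sz)
{
    /* The real code locks and unlocks list_lock here, expecting the
       hooks to have been reset by the time the lock is acquired. */
    return sketch_malloc(sz);
}

/* Plays the role of malloc(): dispatch through the hook if one is set. */
static void *sketch_malloc(size_t sz)
{
    sketch_hook_t hook = sketch_hook;
    if (hook != NULL) {
        if (sketch_dispatches >= sketch_limit)
            return NULL; /* cut the recursion off so the sketch terminates */
        sketch_dispatches++;
        return hook(sz);
    }
    return malloc(sz);
}

/* Perform one allocation and return how many hook dispatches it caused.
   reset_hook_first != 0 models the hooks having been restored. */
int sketch_run(int reset_hook_first, int limit)
{
    assert(limit > 0);
    sketch_dispatches = 0;
    sketch_limit = limit;
    sketch_hook = reset_hook_first ? NULL : sketch_atfork_hook;
    free(sketch_malloc(16)); /* free(NULL) is a no-op if we hit the cutoff */
    sketch_hook = NULL;
    return sketch_dispatches;
}
```

In this sketch, sketch_run(0, 1000) keeps re-entering the hook until it 
hits the cutoff, while sketch_run(1, 1000) never enters the hook at all.  
If the real hooks are not reset in the child, I would expect the same 
shape of recursion, only without a cutoff.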

Do you have ideas how this might happen?  Can you imagine a case that 
would lead to a loop here?

I also see a lock followed immediately by an unlock.  Does this lock 
really protect anything?
Thanks,
-Adam

