[mvapich-discuss] Deadlock in while calling malloc?

Martin Cuma martin.cuma at utah.edu
Tue Nov 10 16:42:28 EST 2015


> The mapping method should be fine.  Can you verify that you modified the program to call
> MPI_Init_thread?

Yes, it does. I call it and print the provided thread level:

  int thread_status;
  MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &thread_status);
  printf("Initialized MPI_THREAD_FUNNELED %d\n", thread_status);

which prints:

  Initialized MPI_THREAD_FUNNELED 1
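
The 1 is the provided level returned in thread_status; with the MPICH-derived 
constants that MVAPICH2 uses, 1 should correspond to MPI_THREAD_FUNNELED. In 
case it is worth adding, a minimal sanity check on the provided level would 
look something like:

  if (thread_status < MPI_THREAD_FUNNELED) {
      fprintf(stderr, "MPI only provided thread level %d\n", thread_status);
      MPI_Abort(MPI_COMM_WORLD, 1);
  }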

> Also let us know how many processes you're running per node.

Ideally I run 4 processes per node with 4-6 threads per process (on 16-24 core 
nodes). The deadlock starts showing up at around 200 processes; to be sure, 
I have been using 240 now. In the current tests I was only able to get 40 
12-core nodes, so I ran 6 processes per node with 4 threads each, which 
effectively oversubscribes each node by a factor of 2 (24 threads on 12 
physical cores). That seemed to increase the chances of the deadlock: I got 
about 5 processes to deadlock, compared to 1-2 when mapping one thread per 
core. In the deadlocked processes I only see 1 thread in the malloc; all 
others are at the mp_barrier, presumably the implicit barrier at the end of 
the OMP parallel section.

The simulation that does this is kind of a beast, so it may be easier to 
write a simpler demo code to reproduce this. For what it's worth, I allocate 
about 20 GB worth of thread-shared arrays, which are then slowly filled by 
the threaded calculations: each thread computes into its own private small 
array and then copies it into the appropriate portion of the large shared 
array. I am wondering whether this also affects how fast the malloc performs, 
increasing the chance of a race condition.
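
In case it helps, here is roughly the shape such a demo would take (sizes, 
names, and the per-thread work below are placeholders, not the actual 
simulation code; the real arrays are far larger):

  #include <mpi.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>

  int main(int argc, char **argv)
  {
      int provided, rank;
      MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      /* large thread-shared array, stand-in for the ~20 GB arrays */
      size_t chunk   = 1 << 20;   /* elements per piece, placeholder size */
      int    nchunks = 64;        /* pieces per process, placeholder count */
      double *shared = malloc(chunk * nchunks * sizeof(double));

      #pragma omp parallel for
      for (int i = 0; i < nchunks; i++) {
          /* each thread allocates its own small private work array ... */
          double *priv = malloc(chunk * sizeof(double));
          for (size_t j = 0; j < chunk; j++)
              priv[j] = (double)i + (double)j;
          /* ... and copies it into its portion of the large shared array */
          memcpy(shared + (size_t)i * chunk, priv, chunk * sizeof(double));
          free(priv);
      }   /* implicit barrier at the end of the parallel region */

      MPI_Barrier(MPI_COMM_WORLD);
      printf("rank %d done\n", rank);
      free(shared);
      MPI_Finalize();
      return 0;
  }

The idea is just that many threads across many ranks call malloc/free 
concurrently while a large shared allocation is being filled, which is the 
pattern where I see the hang.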

Anyway, let me know what you think and we can go from there.

MC

