[mvapich-discuss] Hang in MVAPICH2-2.2 in PSM with MPI_THREAD_MULTIPLE

Moody, Adam T. moody20 at llnl.gov
Thu Aug 3 21:32:24 EDT 2017


Hello MVAPICH team,
We've got a user reporting a hang in MPI_Test when using MVAPICH2-2.2 built for PSM.  The hang only happens when the app initializes MPI with MPI_THREAD_MULTIPLE.  Although the app requests MPI_THREAD_MULTIPLE, it doesn't actually use threads in this case.  The hang does not occur if the app calls MPI_Init instead of requesting MPI_THREAD_MULTIPLE.

The stack trace on the main thread looks like the following (innermost frame first):

pthread_spin_lock
psm_irecv
MPID_Irecv
MPIDU_Sched_continue
MPIDU_Sched_progress
psm_progress_wait
MPIR_Test_impl
PMPI_Test

I suspect the main thread has deadlocked itself on the primary PSM lock, but I'm not sure why.

Can you see where the main thread might have grabbed the psm lock the first time based on this stack trace?

Or perhaps it bailed out of a function without releasing the lock?

Thanks,
-Adam

