[mvapich-discuss] MPI_THREAD_MULTIPLE support

Martin Cuma martin.cuma at utah.edu
Mon Nov 7 15:50:42 EST 2011


Hi Krishna,

Thanks for your reply. I do disable affinity, and I use MVAPICH2 1.7 built 
with the PGI compilers, though I get this with the Intel compilers as well.

Later last night, I discovered that the Nemesis device works fine and 
even performs faster (my code makes heavy use of Allgatherv). So I am 
happy to use the OFA-IB-Nemesis device instead of the OFA-IB-CH3 device, 
where I see this error.

If you still want to pursue this and file a bug report, let me know and 
I'll try to package up the code so that you can reproduce it.

Thanks,
MC

On Mon, 7 Nov 2011, Krishna Kandalla wrote:

> Hi Martin,
>        Thanks for reporting this issue. Could you please let us know which version of MVAPICH2
> you are using? Also, as described in our user guide
> (http://mvapich.cse.ohio-state.edu/support/user_guide_mvapich2-1.7.html#x1-700006.14), could
> you please set MV2_ENABLE_AFFINITY=0 and try re-running your application?
> 
> Thanks,
> Krishna
> 
> On Sun, Nov 6, 2011 at 10:49 PM, Martin Cuma <martin.cuma at utah.edu> wrote:
>       Hello,
>
>       I am wondering how well MPI_THREAD_MULTIPLE is supported in MVAPICH2 built with
>       --with-rdma=gen2. The reason I am asking is that I am getting a deadlock in my code,
>       which communicates from multiple threads/processes. The code runs fine with MPICH2
>       using the Nemesis device.
>
>       The code does a bunch of isends/irecvs; the thread number is always the same within
>       each communicating pair, while the processes involved differ. For example, if process 0
>       sends to process 1, then all threads (say 0-3) on process 0 do one send each, addressed
>       to the corresponding thread on process 1. So, in total, I do 4 sends: thr0 of proc0 to
>       thr0 of proc1, thr1 of proc0 to thr1 of proc1, etc. I ensure the right recipient via a
>       different tag for each thread.
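>
>       In outline, the pattern is something like the minimal sketch below (this is just an
>       illustration, not my actual code; the 4 OpenMP threads, the rank pairing, and the
>       integer buffers are assumptions made for the example):
>
>       #include <mpi.h>
>       #include <omp.h>
>       #include <stdio.h>
>
>       /* Each thread t on a rank exchanges one message with thread t on the
>        * peer rank, using the thread index as the MPI tag. */
>       int main(int argc, char **argv)
>       {
>           int provided, rank;
>
>           MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
>           if (provided < MPI_THREAD_MULTIPLE) {
>               fprintf(stderr, "MPI_THREAD_MULTIPLE not granted (got %d)\n", provided);
>               MPI_Abort(MPI_COMM_WORLD, 1);
>           }
>
>           MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>           int peer = rank ^ 1;    /* assumes an even number of ranks: 0<->1, 2<->3, ... */
>
>           #pragma omp parallel num_threads(4)
>           {
>               int tid = omp_get_thread_num();     /* same thread id on both ranks */
>               int sendbuf = rank * 100 + tid, recvbuf = -1;
>               MPI_Request req[2];
>
>               /* tag = thread id, so thread t only matches thread t on the peer */
>               MPI_Isend(&sendbuf, 1, MPI_INT, peer, tid, MPI_COMM_WORLD, &req[0]);
>               MPI_Irecv(&recvbuf, 1, MPI_INT, peer, tid, MPI_COMM_WORLD, &req[1]);
>               MPI_Waitall(2, req, MPI_STATUSES_IGNORE);
>
>               printf("rank %d thread %d got %d\n", rank, tid, recvbuf);
>           }
>
>           MPI_Finalize();
>           return 0;
>       }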
>
>       Can someone please comment on their experience with a communication pattern like this,
>       and on what could be causing the problem I am seeing?
>
>       Thanks,
>       MC
>
>       --
>       Martin Cuma
>       Center for High Performance Computing
>       University of Utah
>       _______________________________________________
>       mvapich-discuss mailing list
>       mvapich-discuss at cse.ohio-state.edu
>       http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss

-- 
Martin Cuma
Center for High Performance Computing
University of Utah

