[mvapich-discuss] multithreaded mpi_get performance
Thiago Ize
thiago at sci.utah.edu
Mon Nov 8 17:25:18 EST 2010
Hi,
I've noticed that my multithreaded code does not scale when I call
MPI_Get. For instance, I made a test where each thread on node 1 reads
an 8 kB chunk of data from node 0 in a while loop, and I measure about
5-6 Gb/s total whether I use 1 thread or many threads. If I instead use
just a single thread per process but create multiple processes on node
1, I can scale up to about 15-16 Gb/s, which corresponds to my maximum
IB throughput. From the documentation, my understanding is that mvapich2
currently supports only a global mutex, so the MPI calls are effectively
serialized; is this correct? I tried compiling mvapich with
--enable-thread-cs=per-object, but that results in segfaults, probably
because this is not yet supported?
Is there any way that I can get multithreaded code to scale? If not, is
my best alternative, sadly, to just write the code using IB verbs?
Thanks,
Thiago