[mvapich-discuss] multithreaded mpi_get performance

Thiago Ize thiago at sci.utah.edu
Mon Nov 8 17:25:18 EST 2010


Hi,

I've noticed that my multithreaded code does not scale when I call 
MPI_Get.  For instance, I made a test in which each thread on node 1 
reads an 8 kB chunk of data from node 0 in a while loop, and I measure 
about 5-6 Gb/s total whether I use 1 thread or many threads.  If I 
instead use a single thread per process and create multiple processes on 
node 1, I can scale up to about 15-16 Gb/s, which corresponds to my 
maximum IB throughput.  From the documentation, my understanding is that 
mvapich2 currently supports only a global mutex, so the code effectively 
remains serial; is this correct?  I tried compiling mvapich2 with 
--enable-thread-cs=per-object, but that results in segfaults, probably 
because this is not yet supported.
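
For reference, the arithmetic behind those figures can be sketched as 
below.  This is only an illustration, not the actual test code: it 
assumes "Gb/s" means 10^9 bits per second and that an 8 kB chunk is 8192 
bytes, and the function names gbits_per_sec and gets_per_sec_for are 
hypothetical helpers, not part of MPI or mvapich2.

```c
#include <assert.h>
#include <stddef.h>

/* Aggregate throughput in Gb/s (assumed: 10^9 bits per second) for
 * n_gets transfers of bytes_per_get bytes completed in `seconds`. */
double gbits_per_sec(size_t n_gets, size_t bytes_per_get, double seconds)
{
    assert(seconds > 0.0);
    return (double)n_gets * (double)bytes_per_get * 8.0 / seconds / 1e9;
}

/* Number of gets per second needed to sustain target_gbits Gb/s
 * when every get moves bytes_per_get bytes. */
double gets_per_sec_for(double target_gbits, size_t bytes_per_get)
{
    assert(bytes_per_get > 0);
    return target_gbits * 1e9 / ((double)bytes_per_get * 8.0);
}
```

Under those assumptions, 16 Gb/s with 8192-byte chunks needs roughly 
244,000 gets per second, while 5.5 Gb/s corresponds to roughly 84,000 
gets per second, i.e. about 12 microseconds per get if the gets are 
issued strictly one after another.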

Is there any way that I can get multithreaded code to scale?  If not, is 
my best alternative, sadly, to just write the code using IB verbs?

Thanks,
Thiago