[mvapich-discuss] multithreaded mpi_get performance

Mon Nov 8 20:49:45 EST 2010

Hi Thiago,

We are in the process of preparing a 1.6 release, and will look into
the MPI_Get performance. We are also planning to support fine-grained
multi-threading in a future MVAPICH2 release.

Are you setting MV2_ENABLE_AFFINITY=0? Also, could you share the
benchmark you are using? We can run it in our lab to see if there is
anything we can do to improve the performance you're observing.
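
For reference, here is a minimal sketch of passing that parameter at
launch, assuming mpirun_rsh is the launcher. The host names node0 and
node1 and the binary name ./get_bench are placeholders, not taken from
your report. With affinity left enabled, each MPI process is typically
bound to a core and its threads may end up sharing that core, which
could hide any scaling.

```shell
# Placeholders (assumptions): node0, node1, and ./get_bench.
# mpirun_rsh takes environment settings as NAME=VALUE before the executable.
mpirun_rsh -np 2 node0 node1 MV2_ENABLE_AFFINITY=0 ./get_bench
```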

Thanks.

On Mon, Nov 8, 2010 at 5:25 PM, Thiago Ize <thiago at sci.utah.edu> wrote:
> Hi,
>
> I've noticed that my multithreaded code does not scale when I call MPI_Get.
> For instance, I made a test where each thread on node 1 reads an 8 kB chunk
> of data from node 0 in a while loop, and I measure about 5-6 Gb/s total
> whether using 1 thread or many threads. If I use just a single thread but
> instead create multiple processes on node 1, I can scale up to about
> 15-16 Gb/s, which corresponds to my maximum IB throughput. Looking around the
> documentation, my understanding is that MVAPICH2 currently supports only a
> global mutex, so the code effectively remains serial; is this correct? I
> tried compiling MVAPICH2 with --enable-thread-cs=per-object, but that results
> in seg faults, probably because this is not yet supported?
>
> Is there any way that I can get multithreaded code to scale?  If not, is my
> best alternative, sadly, to just write the code using IB verbs?
>
> Thanks,
> Thiago

-- 
Sayantan Sur

Research Scientist
Department of Computer Science
http://www.cse.ohio-state.edu/~surs