[mvapich-discuss] performance problems with gath/scat

Dhabaleswar Panda panda at cse.ohio-state.edu
Fri Jul 23 16:57:27 EDT 2010


Hi Max,

Thanks for your note.

>   We are having serious performance problems
> with collectives when using several hundred cores
> on the Discover system at NASA Goddard.

Could you please give us more details on the performance problems you are
observing: which collectives, what data sizes, what system sizes, etc.?

> I noticed some fixes were made to collectives in 1.5.
> Would these help with scat/gath?

In 1.5, in addition to some fixes in collectives, several thresholds were
changed for point-to-point operations (based on platform and adapter
characteristics) to obtain better performance. These changes will also
have a positive impact on the performance of collectives.

Thus, I would suggest that you upgrade to 1.5 first. If the performance
issues with collectives still remain, we will be happy to debug them
further.

> I noticed a couple of months ago someone reporting
> very poor performance in global sums:
>
> http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/2010-June/002876.html
>
> But the thread ends unresolved.

Since the 1.5 release process overlapped with our examination of this
issue, we got context-switched. We will take a closer look at this issue
with the 1.5 version.

> Has anyone else had these problems?

Thanks,

DK
