[mvapich-discuss] Several 2.0-ga collective performance regressions (vs 1.9a2, 1.9b, 2.0a)

Peter Kjellström cap at nsc.liu.se
Wed Jun 25 09:39:03 EDT 2014


Hi there MVAPICH team!

Short summary:

I finally got around to building the final 2.0 release. I saw the best
performance yet on my non-blocking send/recv tests, yay, but also several
areas of performance regression in the IMB (Intel MPI Benchmarks,
formerly PMB) collective results (vs. 1.9a2, 1.9b and 2.0a).


Detailed description:

The attached png shows the difference in performance between 2.0-ga and
1.9b for my reference IMB run (128 ranks on 8 full 16-core nodes).
The data is 2.0-ga relative to 1.9b, that is, green is good for 2.0-ga
and red is bad (grey is no significant difference). Cell size and
brightness scale with the magnitude of the difference:

 grey: within +/- 10%
 color size1: within +/- 50%
 color size2: within +/- 100%
 color size3: within +/- 200%
 bright color size4: more than +/- 200%

The columns are one per IMB test (SR = SendRecv, AG = AllGather, etc.)
and the rows are transfer sizes (first row smallest, last row 1M).
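
(For reference, a minimal sketch of how one cell's class could be derived
from the two averaged timings. This is illustrative only, not the script
that actually produced the png; the exact diff formula and the function
name are just my shorthand for the legend above.)

  #include <math.h>

  /* Illustrative only: classify one heatmap cell from the averaged IMB
   * timings (usec) of the two builds.  Positive diff = 2.0-ga slower. */
  static int cell_class(double t_20ga, double t_19b)
  {
      double diff = 100.0 * (t_20ga - t_19b) / t_19b;  /* percent difference */
      double mag  = fabs(diff);

      if (mag <= 10.0)  return 0;  /* grey: within +/- 10%           */
      if (mag <= 50.0)  return 1;  /* color size1: within +/- 50%    */
      if (mag <= 100.0) return 2;  /* color size2: within +/- 100%   */
      if (mag <= 200.0) return 3;  /* color size3: within +/- 200%   */
      return 4;                    /* bright color size4: > +/- 200% */
  }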

With that background it should be easy to see that there are four large
bad areas (each spanning more than a few transfer sizes; bright red, i.e.
2.0-ga slower than 1.9b by more than the +200% threshold):

 1) AG, AllGather. Increasingly bad, and worst at large-ish sizes. Note
 that the three largest sizes are ok (256K, 512K, 1M). A rough sketch of
 the kind of loop this test times follows the list.

 2&3) G, Gather. Bad at small sizes and at large sizes (but ok in the middle).

 4) AA, AlltoAll. Bad for small sizes.

 (and a potential 5th would be Bc, Bcast, which is bad-ish for everything
 but large sizes).
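
For anyone not familiar with what the IMB tests actually measure, here is
a rough sketch of the kind of loop behind the AG column. This is not the
IMB source; the buffer handling, iteration count and use of MPI_Wtime are
just a simplified approximation of such a test:

  /* Illustrative only, not the IMB source: time an MPI_Allgather of
   * 'bytes' bytes per rank, averaged over 'iters' iterations.
   * Returns the per-call time in usec, which is what IMB reports. */
  #include <stdlib.h>
  #include <mpi.h>

  static double time_allgather(int bytes, int iters)
  {
      int nranks, i;
      MPI_Comm_size(MPI_COMM_WORLD, &nranks);

      char *sendbuf = calloc(bytes, 1);
      char *recvbuf = calloc((size_t)bytes * nranks, 1);

      MPI_Barrier(MPI_COMM_WORLD);            /* start all ranks together */
      double t0 = MPI_Wtime();
      for (i = 0; i < iters; i++)
          MPI_Allgather(sendbuf, bytes, MPI_BYTE,
                        recvbuf, bytes, MPI_BYTE, MPI_COMM_WORLD);
      double t = (MPI_Wtime() - t0) / iters;

      free(sendbuf);
      free(recvbuf);
      return t * 1.0e6;
  }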

Feel free to dig into the attached IMB output to discover the real
numbers behind the graphics...

Regards,
 Peter


Background information:

Hardware:
 * dual-socket Xeon E5 (2x 8-core), 32 GB per node
 * Mellanox FDR, single switch (for this test)

Software:
 * CentOS-6.5
 * Intel compilers 14.0.2
 * RHEL/CentOS IB stack
 * Slurm with cgroups (only whole nodes used for this test)
 * HT/SMT not enabled

MVAPICH build:
 * configure opts: --enable-hybrid --enable-shared --prefix=...
 * env CC, CXX, FC, F77 set for the Intel compilers
 * no rdmacm, writable umad0, limic or other oddities
 * 1.9b rebuilt in the exact same env for verification

Job launch:
 * verified correct rank pinning and launch (a minimal check sketch follows this list)
 * launch cmd: "mpiexec.hydra -bootstrap slurm IMB..."
 * 1.9b and 2.0-ga run on the same node set
 * geometry: 128 ranks on 8 nodes
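
For reference, one simple way to check the pinning (just an illustrative
sketch, not necessarily the exact method used here) is to have each rank
report the host and core it is running on:

  /* Illustrative pinning check: each rank prints its host and core. */
  #define _GNU_SOURCE
  #include <sched.h>
  #include <stdio.h>
  #include <mpi.h>

  int main(int argc, char **argv)
  {
      int rank, len;
      char host[MPI_MAX_PROCESSOR_NAME];

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Get_processor_name(host, &len);

      printf("rank %3d on %s core %d\n", rank, host, sched_getcpu());

      MPI_Finalize();
      return 0;
  }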

-------------- next part --------------
A non-text attachment was scrubbed...
Name: mvp20ga_vs_mvp19b_imb_at_128r.png
Type: image/png
Size: 2467 bytes
Desc: not available
URL: <http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20140625/c08cc3b2/attachment-0001.png>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: mvp19b_imb_128r.txt
URL: <http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20140625/c08cc3b2/attachment-0002.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: mvp20ga_imb_128r.txt
URL: <http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20140625/c08cc3b2/attachment-0003.txt>

