[mvapich-discuss] Proposal to fix MPI_Allreduce bandwidth
Shalnov, Sergey
Sergey.Shalnov at intel.com
Wed Apr 18 07:45:33 EDT 2007
Hello,
I downloaded a fresh copy of mvapich-0.9.9 from SVN and ran several
experiments with collective operations such as MPI_Allreduce and
MPI_Allgatherv. I found that MPI_Allreduce bandwidth drops sharply
for message sizes between 16 KB and 512 KB. I am not sure about other
architectures, but it shows up on my Intel-based InfiniBand clusters
(I tested on two clusters; the results below are from one of them).
The attached Microsoft Excel spreadsheet contains the results and
graphs to help you examine them. There are three columns:
1 - mvapich-0.9.9: the mvapich-0.9.9 release tarball.
2 - mvapich-0.9.9-trunk: the version from the main trunk (with Dmitri
Mishura's fix,
http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/2007-April/000734.html,
included).
3 - mvapich-0.9.9-fixed1: #2 plus the attached patch.
The attached patch modifies line 73 of
$MVAPICH_BUILD_HOME/src/coll/intra_fns_new.c, which currently reads:
#define SHMEM_COLL_ALLREDUCE_THRESHOLD (1<<19)
I propose changing it to:
#define SHMEM_COLL_ALLREDUCE_THRESHOLD (1<<15)
i.e., lowering the shared-memory Allreduce threshold from 512 KB
(1<<19 = 524288 bytes) to 32 KB (1<<15 = 32768 bytes). This change
improves bandwidth on my cluster as shown below:
Message size (bytes)  mvapich-0.9.9  mvapich-0.9.9-fixed1  mvapich-0.9.9-trunk
4096                  89.6467        93.0733               93.2853
8192                  117.332        139.493               137.496
16384                 142.787        184.153               185.812
32768                 158.847        286.147               206.245
65536                 144.555        328.089               192.31
131072                152.266        289.667               190.743
262144                166.436        279.73                203.1
524288                32.6395        253.501               252.428
1048576               30.8811        231.03                229.957
2097152               27.8332        199.249               201.419
4194304               26.4895        191.914               192.835
8388608               26.157         183.449               184.363
16777216              25.4449        178.985               181.572
33554432              25.9249        177.411               179.012
The testing method sends the same total amount of data (167772160
bytes) on each iteration, divided into blocks of the given message
size. This way, each iteration measures the network bandwidth for a
particular message size in the MPI collective operation.
Thank you
Sergey
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bug_fix_tests_mvapich-0.9.9_main_trunk.xls
Type: application/vnd.ms-excel
Size: 48640 bytes
Desc: bug_fix_tests_mvapich-0.9.9_main_trunk.xls
Url : http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20070418/1ba97181/bug_fix_tests_mvapich-0.9.9_main_trunk-0001.xls
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fix2.patch
Type: application/octet-stream
Size: 428 bytes
Desc: fix2.patch
Url : http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20070418/1ba97181/fix2-0001.obj