[mvapich-discuss] Re: Question on bandwidth test
Abhinav Vishnu
vishnu at cse.ohio-state.edu
Tue Jun 5 09:53:19 EDT 2007
Hi Wenli,
Thanks for using MVAPICH and reporting the performance issue to us.
IMHO, this is not a problem in the MPI layer; the performance
degradation should be visible in tests at the verbs layer as well.
I am assuming that you are using OFED-1.1 and HCA firmware version 3.3.3
or greater.
To see whether this is the case, may I request that you do the following:
Say you want to run the tests on inode28 and inode30
1. On inode28:
% ib_rdma_bw -s1048576 -n100
2. On inode30:
% ib_rdma_bw -s1048576 -n100 inode28
I suspect you will see a performance degradation similar to the one you
are seeing at the MPI layer.
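The two steps above can also be sketched as a small script that builds both
invocations in one place; the node names follow the example, and actually
launching the commands (e.g. over ssh or rsh) is left to you:

```shell
#!/bin/sh
# Sketch: assemble the two ib_rdma_bw invocations from the steps above.
# Node names (inode28/inode30) are from the example; the commands are
# printed so they can be inspected before running on each node.
SERVER=inode28
CLIENT=inode30
SIZE=1048576   # 1 MB messages, where the MPI-level drop shows up
ITERS=100

server_cmd="ib_rdma_bw -s$SIZE -n$ITERS"
client_cmd="ib_rdma_bw -s$SIZE -n$ITERS $SERVER"

echo "on $SERVER: $server_cmd"
echo "on $CLIENT: $client_cmd"
```

The server side must be started first so it is listening when the client
connects to it.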
My answers with respect to the Multi-Rail paper are inline; please scroll
down.
>
>
> My system is:
> -- 2.2GHz Dual Core AMD Opteron(tm) Processor 275, 8GB Mem
> -- Linux 2.6.9-42.ELsmp x86_64
> -- openib-1.1 { Detected the following HCAs: 1) mthca0 [ Mellanox PCI-X ] }
>
> 1. Test inter-node bandwidth with -DVIADEV_RGET_SUPPORT .
> setup_ch_gen2 starts... -D_X86_64_ -DEARLY_SEND_COMPLETION -DMEMORY_SCALE
> -DVIADEV_RGET_SUPPORT -DLAZY_MEM_UNREGISTER -DCH_GEN2 -D_SMP_ -D_SMP_RNDV_
> -D_MLX_PCI_X_ -I/usr/local/ofed/include -O3
>
> $ mpirun_rsh -rsh -np 2 inode28 inode30 ./osu_bw
> # OSU MPI Bandwidth Test (Version 2.2)
> # Size Bandwidth (MB/s)
> 1 0.243180
> 2 0.507795
> 4 1.008787
> 8 2.030054
> 16 4.008455
> 32 8.113140
> 64 16.160978
> 128 33.764735
> 256 67.708075
> 512 161.522157
> 1024 335.222506
> 2048 491.421716
> 4096 568.259955
> 8192 606.043232
> 16384 662.063392
> 32768 738.589843
> 65536 783.586601
> 131072 807.462616
> 262144 820.750931
> 524288 685.880335
> 1048576 660.237959
> 2097152 659.233480
> 4194304 659.946110
> 2. Test inter-node bandwidth with -DVIADEV_RPUT_SUPPORT .
> setup_ch_gen2 starts... -D_X86_64_ -DEARLY_SEND_COMPLETION -DMEMORY_SCALE
> -DVIADEV_RPUT_SUPPORT -DLAZY_MEM_UNREGISTER -DCH_GEN2 -D_SMP_ -D_SMP_RNDV_
> -D_MLX_PCI_X_ -I/usr/local/ofed/include -O3
>
> $ mpirun_rsh -rsh -np 2 inode28 inode30 ./osu_bw
> # OSU MPI Bandwidth Test (Version 2.2)
> # Size Bandwidth (MB/s)
> 1 0.248081
> 2 0.516046
> 4 1.034260
> 8 2.069607
> 16 4.110799
> 32 8.282444
> 64 16.593745
> 128 34.620911
> 256 69.113305
> 512 163.455879
> 1024 341.066875
> 2048 496.503655
> 4096 569.049428
> 8192 606.183374
> 16384 624.840449
> 32768 713.280615
> 65536 769.011487
> 131072 800.359506
> 262144 814.869019
> 524288 679.025085
> 1048576 652.137840
> 2097152 650.207077
> 4194304 650.629356
> 3. Test intra-node bandwidth with -DVIADEV_RPUT_SUPPORT .
> $ mpirun_rsh -rsh -np 2 inode28 inode28 ./osu_bw
> # OSU MPI Bandwidth Test (Version 2.2)
> # Size Bandwidth (MB/s)
> 1 2.173175
> 2 4.449079
> 4 9.049134
> 8 20.301348
> 16 42.489627
> 32 85.085168
> 64 153.869271
> 128 286.734337
> 256 480.187573
> 512 741.525232
> 1024 932.896797
> 2048 1145.834426
> 4096 1291.731546
> 8192 1388.989562
> 16384 1428.285773
> 32768 1453.529249
> 65536 1431.307671
> 131072 1445.227803
> 262144 1393.404399
> 524288 1168.315567
> 1048576 1071.952093
> 2097152 1072.327638
> 4194304 1064.196619
>
> I have seen test results on your homepage (http://mvapich.cse.ohio-state.edu/
> performance/mvapich/opteron/MVAPICH-opteron-gen2-DDR.shtml, http://
> mvapich.cse.ohio-state.edu/performance/mvapich/intra_opteron.shtml): the
> inter-node bandwidth results there look normal, but the intra-node results
> look like mine. The bandwidth results in your paper BUILDING MULTIRAIL
> INFINIBAND CLUSTERS: MPI-LEVEL DESIGN AND PERFORMANCE EVALUATION (SC2004,
> Fig. 9) suggest that a striping or binding optimization would alleviate
> this problem.
Yes, striping the data across multiple paths does help the performance of
microbenchmarks and applications, as shown in the paper. However, your
system uses only one HCA and one port for communication, so these
scheduling policies are unlikely to help here.
I think we will have a clearer idea about the point of performance
degradation, once you have the results from ib_rdma_bw. Please let us
know the outcome of your experimentation.
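For what it's worth, the size of the drop can be quantified directly from
the numbers quoted above. The snippet below is plain arithmetic (peak
bandwidth vs. the 4 MB point) and assumes only a POSIX shell with awk:

```shell
#!/bin/sh
# Percent drop from peak bandwidth to the 4 MB message size, using the
# figures reported above (inter-node RGET: peak 820.75 MB/s at 256 KB;
# intra-node: peak 1453.53 MB/s at 32 KB).
drop() {
    awk -v peak="$1" -v large="$2" \
        'BEGIN { printf "%.1f", (peak - large) / peak * 100 }'
}

inter=$(drop 820.750931 659.946110)
intra=$(drop 1453.529249 1064.196619)
echo "inter-node drop: ${inter}%"   # ~19.6%
echo "intra-node drop: ${intra}%"   # ~26.8%
```

So the intra-node case actually loses a larger fraction of its peak
bandwidth at large message sizes than the inter-node case does.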
Thanks,
:- Abhinav
>
> What do you think is the source of the problem in my bandwidth tests? To
> get optimal bandwidth, what do you think I should modify relative to the
> default options in the original MVAPICH 0.9.8 package?
>
>
> Any reply is appreciated!
>
> Thanks,
> Wenli
>