[mvapich-discuss] Bandwidth on single hca dual port multirail configuration

Jie Cai Jie.Cai at cs.anu.edu.au
Wed Mar 4 21:16:57 EST 2009


Thanks for the email and the suggestions.

Dhabaleswar Panda wrote:
> I think you posted similar questions on other mailing lists and you got
> some answers. You need to examine multiple things to see what is happening
> on your system.
>
> - What is the speed of your ConnectX card - SDR or DDR?
>   
The HCA I am using is 4x DDR (ConnectX MHGH28-XTC).
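For what it is worth, a quick way to confirm that both ports are actually
up at the DDR rate (assuming the standard OFED diagnostics are installed)
is:

   # each active port should report "State: Active" and "Rate: 20" (4X DDR)
   ibstat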
> - What is your platform (Intel or AMD)? What is the memory bandwidth
>   available on this platform? Can it support two parallel streams of
>   SDR or DDR IB communication?
>   
I am using an Intel Core 2 Quad Q6600 CPU with the Intel X38 Express
chipset. The memory bandwidth is ~5.3 GB/s per channel, and the bandwidth
measured with the STREAM benchmark is:
-------------------------------------------------------------
Function      Rate (MB/s)   Avg time     Min time     Max time
Copy:        5263.2339       0.0061       0.0061       0.0063
Scale:       5211.7318       0.0061       0.0061       0.0062
Add:         5710.2587       0.0084       0.0084       0.0084
Triad:       5734.0973       0.0084       0.0084       0.0084
-------------------------------------------------------------

These results suggest that the memory bandwidth should be sufficient to
drive two DDR ports: even two rails at the ~1.45 GB/s I currently see per
port would amount to about 2.9 GB/s, still well below the ~5.3 GB/s STREAM
copy rate.
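For reference, the STREAM numbers above can be reproduced along the
following lines (a rough sketch; it assumes gcc and the stock stream.c from
the STREAM web site, and the exact flags may differ from what I used):

   gcc -O3 -fopenmp stream.c -o stream
   export OMP_NUM_THREADS=4    # the Q6600 has four cores
   ./stream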
> - Which version of MVAPICH2 you are using? Which interface of MVAPICH2 you
>   are using - OpenFabrics-IB or uDAPL. OpenFabrics-IB interface supports
>   multi-rail option and you should be able to use multiple ports or
>   adapters. The uDAPL interface only supports single port/adapter.
>   
I have installed MVAPICH2 1.2-p1 with the default options on SUSE Linux,
so the OpenFabrics-IB interface is the one being used.
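If it helps, the exact build configuration can be dumped with the mpiname
utility, assuming it is available in this release:

   mpiname -a    # prints MVAPICH2 version and configure options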
> - How much performance you get if you use one port? 
The performance using one rail is ~1.45 GB/s, which is about the same as
with both ports.
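For reference, the single-rail number came from essentially the same
command as in my original post below, just restricted to one port:

   mpirun_rsh -ssh -np 2 node02 node01 MV2_NUM_HCAS=1 MV2_NUM_PORTS=1 \
       MV2_NUM_QP_PER_PORT=1 ./osu_bandwidth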

> Do the numbers differ
>   when you use one port vs. another port.
>   
There is no difference between the two ports.

> - You seem to be using OSU Put bandwidth test. This reports bandwidth
>   achieved through MPI one-sided Put operations. Did you try the
>   regular OSU bandwidth test (which shows the performance of
>   two-sided operations)? Do you see any performance difference?
>   
Testing with osu_bandwidth on the multirail configuration, I observed
roughly the same bandwidth (1459.15 MB/s).
> If you systematically analyze the problem, you should be able to find out
> what is going on.
>
> DK
>   
Matthew: I am not sure what firmware version the HCAs are running, but I
will definitely check it.
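A quick way to read it back, assuming the usual OFED diagnostics are
installed, is:

   ibstat                       # look for the "Firmware version:" line
   ibv_devinfo | grep fw_ver    # or the "fw_ver:" field here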

In general, the hardware platform does not seem to be the limitation.
I will get back to you once I have more information.

Thanks a lot.

Best Regards,
Jie
> On Thu, 5 Mar 2009, Jie Cai wrote:
>
>   
>> We have a cluster in which each node has a single dual-port ConnectX
>> HCA, and we are trying to build a dual-port multirail IB cluster.
>>
>> I have run the OSU put bandwidth test on the cluster with MVAPICH2.
>>
>> mpirun_rsh -ssh -np 2 node02 node01 MV2_NUM_HCAS=1 MV2_NUM_PORTS=2
>> MV2_NUM_QP_PER_PORT=1 ./osu_bandwidth
>>
>> However, I did not see any bandwidth improvement. The peak bandwidth I
>> got for the test is 1458.93 MB/s, which is far below my expectation
>> (2.5 GB/s).
>>
>> Does anyone know what is going on?
>>
>> --
>> Jie Cai
>
>   

