[Hadoop-RDMA-discuss] RDMA for Apache Kafka 0.9.1

Lu, Xiaoyi lu.932 at osu.edu
Wed Apr 3 13:34:39 EDT 2019


This message is just to close this thread.

The issue reported by the user can be fixed with the following two configuration changes.

1. Add “--producer.config config/producer.properties” to the command.

2. Add “rdma.dev.name=mlx4_0” to the *.properties files (producer.properties, server.properties and consumer.properties).
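
For anyone hitting the same "Cannot find valid HCA" error, here is a minimal Python sketch of how the value for the second fix could be derived. It assumes the device name is whatever `ibstat` reports on the `CA '...'` line, as in the user's report quoted later in this thread. The helper names `hca_names` and `rdma_properties` are hypothetical and not part of the RDMA for Apache Kafka package; only the `rdma.dev.name` and `rdma.dev.num` keys come from this thread.

```python
import re

# Sample `ibstat` output, abbreviated from the report quoted later in this thread.
IBSTAT_SAMPLE = """\
CA 'mlx4_0'
        CA type: MT4099
        Number of ports: 1
        Port 1:
                State: Active
                Physical state: LinkUp
                Link layer: InfiniBand
"""


def hca_names(ibstat_output):
    """Return the channel adapter names on the "CA '<name>'" lines (hypothetical helper)."""
    return re.findall(r"^CA '([^']+)'", ibstat_output, flags=re.MULTILINE)


def rdma_properties(dev_name, dev_num=0):
    """Build the two property lines discussed in this thread (hypothetical helper)."""
    return [f"rdma.dev.name={dev_name}", f"rdma.dev.num={dev_num}"]


if __name__ == "__main__":
    # Print the lines to add to each *.properties file for every adapter found.
    for name in hca_names(IBSTAT_SAMPLE):
        print("\n".join(rdma_properties(name)))
```

The printed lines would then be added by hand to each of the *.properties files; the sketch deliberately does not edit any files itself.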

Regarding the performance issue: after rerunning the producer test, the user saw a clear improvement with RDMA over IPoIB (about 134K vs. 77K records/sec, and 1345.57 ms vs. 2443.21 ms average latency), as shown below.

rdma:
[root@test rdma-kafka-0.9.1]# bin/kafka-producer-perf-test.sh --topic test --num-records 500000 --throughput -1 --record-size 100 --producer.config config/producer.properties
500000 records sent, 133976.420150 records/sec (12.78 MB/sec), 1345.57 ms avg latency, 1935.00 ms max latency, 1492 ms 50th, 1874 ms 95th, 1924 ms 99th, 1935 ms 99.9th.

ipoib:
[root@test rdma-kafka-0.9.1]# bin/kafka-producer-perf-test.sh --topic test --num-records 500000 --throughput -1 --record-size 100 --producer.config config/producer.properties
365938 records sent, 73187.6 records/sec (6.98 MB/sec), 2068.9 ms avg latency, 3682.0 max latency.
500000 records sent, 77267.810230 records/sec (7.37 MB/sec), 2443.21 ms avg latency, 3682.00 ms max latency, 2870 ms 50th, 3657 ms 95th, 3675 ms 99th, 3681 ms 99.9th.
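
To make the comparison concrete, the following Python sketch parses the two final summary lines above and computes the RDMA/IPoIB ratios. The regular expression assumes the summary format exactly as printed in these runs, and `parse_summary` and `compare` are hypothetical helper names used only for illustration.

```python
import re

# Final summary lines copied from the two producer runs above (percentiles omitted).
RDMA_LINE = ("500000 records sent, 133976.420150 records/sec (12.78 MB/sec), "
             "1345.57 ms avg latency, 1935.00 ms max latency")
IPOIB_LINE = ("500000 records sent, 77267.810230 records/sec (7.37 MB/sec), "
              "2443.21 ms avg latency, 3682.00 ms max latency")

# Assumed format: "<n> records sent, <r> records/sec (<m> MB/sec), <a> ms avg latency"
SUMMARY = re.compile(
    r"(?P<records>\d+) records sent, "
    r"(?P<rate>[\d.]+) records/sec \((?P<mb>[\d.]+) MB/sec\), "
    r"(?P<avg>[\d.]+) ms avg latency"
)


def parse_summary(line):
    """Extract throughput and average latency from one summary line."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError("not a producer perf summary line")
    return {
        "records_per_sec": float(match["rate"]),
        "mb_per_sec": float(match["mb"]),
        "avg_latency_ms": float(match["avg"]),
    }


def compare(rdma, ipoib):
    """Return (throughput ratio, latency ratio); both above 1 favor RDMA."""
    throughput_ratio = rdma["records_per_sec"] / ipoib["records_per_sec"]
    latency_ratio = ipoib["avg_latency_ms"] / rdma["avg_latency_ms"]
    return throughput_ratio, latency_ratio


if __name__ == "__main__":
    t, l = compare(parse_summary(RDMA_LINE), parse_summary(IPOIB_LINE))
    print(f"throughput: {t:.2f}x, avg latency: {l:.2f}x lower")
```

For these two runs the ratios come out to roughly 1.73 for throughput and 1.82 for average latency. This is a single run per mode, so the numbers should be read as indicative only.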

The consumer test results are likely due to configuration issues as well. We are following up with the user on those tests.

In summary, the two configuration changes above fix the reported issue.

Xiaoyi

> On Apr 2, 2019, at 5:01 AM, 13813995851 at 139.com wrote:
> 
> Hi, I tested the ipoib and rdma modes. Why does the ipoib mode perform better?
>  
> bin/kafka-consumer-perf-test.sh --topic test --messages 300000 --broker-list 10.10.10.34:9092 --consumer.config config/consumer.properties
> project	start.time	 end.time	 data.consumed.in.MB	 MB.sec	 data.consumed.in.nMsg	 nMsg.sec	 rebalance.time.ms	 fetch.time.ms	 fetch.MB.sec	 fetch.nMsg.sec
> consumer client(rdma/500w messages)	2019-04-02 16:35:02:837	 2019-04-02 16:35:03:960	47.7033	42.4784	500205	445418.5218	18	1105	43.1704	452674.2081
> consumer client(ipoib/500w messages)	2019-04-02 16:14:48:369	 2019-04-02 16:14:49:353	47.7033	48.4789	500205	508338.4146	19	965	49.4334	518347.1503
> 
> 13813995851 at 139.com <mailto:13813995851 at 139.com>
> From: 13813995851 at 139.com <mailto:13813995851 at 139.com>
> Date: 2019-04-02 14:43
> To: Lu, Xiaoyi <mailto:lu.932 at osu.edu>
> CC: rdma-hadoop-discuss <mailto:rdma-hadoop-discuss at cse.ohio-state.edu>
> Subject: Re: Re: [Hadoop-RDMA-discuss] RDMA for Apache Kafka 0.9.1
> It's OK, thank you very much!
> 
> 13813995851 at 139.com
>  
> From: Lu, Xiaoyi <mailto:lu.932 at osu.edu>
> Date: 2019-04-02 04:42
> To: 13813995851 at 139.com <mailto:13813995851 at 139.com>
> CC: rdma-hadoop-discuss <mailto:rdma-hadoop-discuss at cse.ohio-state.edu>
> Subject: Re: [Hadoop-RDMA-discuss] RDMA for Apache Kafka 0.9.1
> Did you use the --producer.config flag?
> 
> bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test --producer.config config/producer.properties
> 
> BTW, you can change rdma.dev.num back to 0.
> 
> Xiaoyi
> 
>> On Mar 31, 2019, at 10:28 PM, 13813995851 at 139.com <mailto:13813995851 at 139.com> wrote:
>> 
>> hi,
>> 
>> 13813995851 at 139.com <mailto:13813995851 at 139.com>
>>  
>> From: Lu, Xiaoyi <mailto:lu.932 at osu.edu>
>> Date: 2019-04-01 09:49
>> To: 13813995851 at 139.com <mailto:13813995851 at 139.com>
>> CC: rdma-hadoop-discuss <mailto:rdma-hadoop-discuss at cse.ohio-state.edu>
>> Subject: Re: [Hadoop-RDMA-discuss] RDMA for Apache Kafka 0.9.1
>> Are you able to run some basic IB-level tests on your cluster?
>> 
>> 
>>> On Mar 31, 2019, at 9:46 PM, 13813995851 at 139.com <mailto:13813995851 at 139.com> wrote:
>>> 
>>> I have changed the device.num to 1, but it still can't find the HCA card:
>>> [root@test rdma-kafka-0.9.1]# bin/kafka-topics.sh --list --zookeeper localhost:2181
>>> test
>>> [root@test rdma-kafka-0.9.1]# bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
>>> Cannot find valid HCA
>>> at line 319 in file ucr_init.c
>>> 
>>> 13813995851 at 139.com <mailto:13813995851 at 139.com>
>>>  
>>> From: Xiaoyi Lu <mailto:lu.932 at osu.edu>
>>> Date: 2019-03-29 23:56
>>> To: 13813995851 at 139.com <mailto:13813995851 at 139.com>
>>> CC: rdma-hadoop-discuss <mailto:rdma-hadoop-discuss at cse.ohio-state.edu>
>>> Subject: Re: [Hadoop-RDMA-discuss] RDMA for Apache Kafka 0.9.1
>>> Can you change the device.num to 1 and recheck?
>>> 
>>> Xiaoyi
>>> 
>>> Sent from my iPhone
>>> 
>>> On Mar 28, 2019, at 10:21 PM, "13813995851 at 139.com <mailto:13813995851 at 139.com>" <13813995851 at 139.com <mailto:13813995851 at 139.com>> wrote:
>>> 
>>>> Hi, Xiaoyi,
>>>> 
>>>> Thank you for your reply!
>>>> I have set the corresponding config values in the producer.properties, server.properties and consumer.properties files,
>>>> but the HCA card still can't be found. Is there any problem in the attached log?
>>>> 
>>>> Best regards!
>>>> 13813995851 at 139.com <mailto:13813995851 at 139.com>
>>>>  
>>>> From: Lu, Xiaoyi <mailto:lu.932 at osu.edu>
>>>> Date: 2019-03-29 07:07
>>>> To: 13813995851 at 139.com <mailto:13813995851 at 139.com>
>>>> CC: rdma-hadoop-discuss <mailto:rdma-hadoop-discuss at cse.ohio-state.edu>
>>>> Subject: Re: [Hadoop-RDMA-discuss] RDMA for Apache Kafka 0.9.1
>>>> Hi,
>>>>  
>>>> Section "4.2 Advanced Configuration" of our user guide mentions the following configuration parameters:
>>>>  
>>>> rdma.dev.name=mlx5_0
>>>>  
>>>> rdma.dev.num=0
>>>>  
>>>> You need to set the corresponding config values in the producer.properties, server.properties and consumer.properties files.
>>>>  
>>>> Xiaoyi
>>>>  
>>>> > On Mar 28, 2019, at 4:14 AM, 13813995851 at 139.com <mailto:13813995851 at 139.com> wrote:
>>>> >
>>>> > Hello, I encountered a problem when I installed RDMA for Apache Kafka.
>>>> > Can you tell me how to solve it? Thanks.
>>>> > [root@test rdma-kafka-0.9.1]# ibstat
>>>> > CA 'mlx4_0'
>>>> >         CA type: MT4099
>>>> >         Number of ports: 1
>>>> >         Firmware version: 2.42.5000
>>>> >         Hardware version: 1
>>>> >         Node GUID: 0xf45214030083ede0
>>>> >         System image GUID: 0xf45214030083ede3
>>>> >         Port 1:
>>>> >                 State: Active
>>>> >                 Physical state: LinkUp
>>>> >                 Rate: 40 (FDR10)
>>>> >                 Base lid: 17
>>>> >                 LMC: 0
>>>> >                 SM lid: 6
>>>> >                 Capability mask: 0x02514868
>>>> >                 Port GUID: 0xf45214030083ede1
>>>> >                 Link layer: InfiniBand
>>>> >
>>>> > [root@test rdma-kafka-0.9.1]# bin/kafka-topics.sh --list --zookeeper localhost:2181
>>>> > test
>>>> > [root@test rdma-kafka-0.9.1]# bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
>>>> > Cannot find valid HCA
>>>> >  at line 319 in file ucr_init.c
>>>> >
>>>> > Best regards!
>>>> > 13813995851 at 139.com <mailto:13813995851 at 139.com>
>>>> > _______________________________________________
>>>> > RDMA-Hadoop-discuss mailing list
>>>> > RDMA-Hadoop-discuss at cse.ohio-state.edu <mailto:RDMA-Hadoop-discuss at cse.ohio-state.edu>
>>>> > http://mailman.cse.ohio-state.edu/mailman/listinfo/rdma-hadoop-discuss <http://mailman.cse.ohio-state.edu/mailman/listinfo/rdma-hadoop-discuss>
>>>>  
>>>> <kafka.txt>
>> 
>> <ib_test>
