Flume Kafka Sink: org.apache.kafka.common.errors.RecordTooLargeException

Date: 2017-02-28 11:25:25

Tags: apache-kafka flume-ng

I am trying to read data from a JMS source and push it into a Kafka topic. After a few hours I noticed that the push rate to the Kafka topic had dropped to almost zero, and after some initial analysis I found the following exception in the Flume logs:

28 Feb 2017 16:35:44,758 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.SinkRunner$PollingRunner.run:158)  - Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: Failed to publish events
        at org.apache.flume.sink.kafka.KafkaSink.process(KafkaSink.java:252)
        at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:67)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:145)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.RecordTooLargeException: The message is 1399305 bytes when serialized which is larger than the maximum request size you have configured with the max.request.size configuration.
        at org.apache.kafka.clients.producer.KafkaProducer$FutureFailure.<init>(KafkaProducer.java:686)
        at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:449)
        at org.apache.flume.sink.kafka.KafkaSink.process(KafkaSink.java:212)
        ... 3 more
Caused by: org.apache.kafka.common.errors.RecordTooLargeException: The message is 1399305 bytes when serialized which is larger than the maximum request size you have configured with the max.request.size configuration.

My Flume logs show the current value of max.request.size as 1048576 (the Kafka producer default of 1 MB), which is clearly much smaller than 1399305. Increasing max.request.size should eliminate these exceptions, but I could not find the right place to update that value.

My flume.config:

a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.channels.c1.type = file
a1.channels.c1.transactionCapacity = 1000
a1.channels.c1.capacity = 100000000
a1.channels.c1.checkpointDir = /data/flume/apache-flume-1.7.0-bin/checkpoint
a1.channels.c1.dataDirs = /data/flume/apache-flume-1.7.0-bin/data

a1.sources.r1.type = jms

a1.sources.r1.interceptors.i1.type = timestamp
a1.sources.r1.interceptors.i1.preserveExisting = true

a1.sources.r1.channels = c1
a1.sources.r1.initialContextFactory = some context urls
a1.sources.r1.connectionFactory = some_queue
a1.sources.r1.providerURL = some_url 
#a1.sources.r1.providerURL = some_url
a1.sources.r1.destinationType = QUEUE
a1.sources.r1.destinationName = some_queue_name 
a1.sources.r1.userName = some_user
a1.sources.r1.passwordFile= passwd

a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.topic = some_kafka_topic
a1.sinks.k1.kafka.bootstrap.servers = some_URL
a1.sinks.k1.kafka.producer.acks = 1
a1.sinks.k1.flumeBatchSize = 1
a1.sinks.k1.channel = c1

Any help would be greatly appreciated!!

2 Answers:

Answer 0 (score: 1):

This change has to be made on the Kafka side. Update the Kafka producer configuration file producer.properties with a larger value, such as:

max.request.size=10000000
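
For reference, a minimal sketch of what that producer.properties might look like; the bootstrap.servers value below is a placeholder, not from the original post:

# producer.properties - client-side Kafka producer settings
bootstrap.servers=localhost:9092
# allow serialized requests up to ~10 MB
max.request.size=10000000

Note that the brokers must also be configured to accept records of that size (the broker-side message.max.bytes setting), otherwise the broker will reject the record.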

Answer 1 (score: 1):

It looks like I have solved my problem. As suspected, increasing max.request.size eliminated the exception. To update Kafka sink (producer) properties like this, Flume provides the constant prefix kafka.producer., and any Kafka producer property can be appended to that prefix.

So mine becomes: a1.sinks.k1.kafka.producer.max.request.size = 5271988
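
Slotted into the sink configuration from the question, that line sits alongside the other kafka.* properties; a sketch using the placeholders from the question:

a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.topic = some_kafka_topic
a1.sinks.k1.kafka.bootstrap.servers = some_URL
a1.sinks.k1.kafka.producer.acks = 1
# everything after the kafka.producer. prefix is passed straight through to the Kafka producer client
a1.sinks.k1.kafka.producer.max.request.size = 5271988
a1.sinks.k1.flumeBatchSize = 1
a1.sinks.k1.channel = c1

The chosen value (5271988) comfortably exceeds the 1399305-byte record reported in the stack trace.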