Apache Flume: kafka.consumer.ConsumerTimeoutException

Date: 2016-02-26 15:44:19

Tags: apache-kafka flume flume-ng

I am trying to build a pipeline with Apache Flume: spooldir -> Kafka channel -> HDFS sink

Events reach the Kafka topic without a problem, and I can see them with kafkacat. But the Kafka channel fails to write the files to HDFS through the sink. The error is:

Timed out while waiting for data from Kafka

Full log:

2016-02-26 18:25:17,125 (SinkRunner-PollingRunner-DefaultSinkProcessor-SendThread(zoo02:2181)) [DEBUG - org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:717)] Got ping response for sessionid: 0x2524a81676d02aa after 0ms

2016-02-26 18:25:19,127 (SinkRunner-PollingRunner-DefaultSinkProcessor-SendThread(zoo02:2181)) [DEBUG - org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:717)] Got ping response for sessionid: 0x2524a81676d02aa after 1ms

2016-02-26 18:25:21,129 (SinkRunner-PollingRunner-DefaultSinkProcessor-SendThread(zoo02:2181)) [DEBUG - org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:717)] Got ping response for sessionid: 0x2524a81676d02aa after 0ms

2016-02-26 18:25:21,775 (SinkRunner-PollingRunner-DefaultSinkProcessor) [DEBUG - org.apache.flume.channel.kafka.KafkaChannel$KafkaTransaction.doTake(KafkaChannel.java:327)] Timed out while waiting for data from Kafka
kafka.consumer.ConsumerTimeoutException
    at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:69)
    at kafka.consumer.ConsumerIterator.makeNext(ConsumerIterator.scala:33)
    at kafka.utils.IteratorTemplate.maybeComputeNext(IteratorTemplate.scala:66)
    at kafka.utils.IteratorTemplate.hasNext(IteratorTemplate.scala:58)
    at org.apache.flume.channel.kafka.KafkaChannel$KafkaTransaction.doTake(KafkaChannel.java:306)
    at org.apache.flume.channel.BasicTransactionSemantics.take(BasicTransactionSemantics.java:113)
    at org.apache.flume.channel.BasicChannelSemantics.take(BasicChannelSemantics.java:95)
    at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:374)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:745)

My Flume configuration is:

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c2

# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /home/alex/spoolFlume

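# Describe the sink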
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://10.12.0.1:54310/logs/flumetest/
a1.sinks.k1.hdfs.filePrefix = flume-
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = Text

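# Describe the channel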
a1.channels.c2.type   = org.apache.flume.channel.kafka.KafkaChannel
a1.channels.c2.capacity = 10000
a1.channels.c2.transactionCapacity = 1000
a1.channels.c2.brokerList=kafka10:9092,kafka11:9092,kafka12:9092
a1.channels.c2.topic=flume_test_001
a1.channels.c2.zookeeperConnect=zoo00:2181,zoo01:2181,zoo02:2181

# Bind the source and sink to the channel
a1.sources.r1.channels = c2
a1.sinks.k1.channel = c2

Using a memory channel instead of the Kafka channel, everything works fine.
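For comparison, this is a minimal memory channel definition that can be swapped in for c2 (a sketch using Flume's built-in memory channel; the capacities simply mirror the Kafka channel settings above):

# Memory channel alternative (sketch; same channel name c2)
a1.channels.c2.type = memory
a1.channels.c2.capacity = 10000
a1.channels.c2.transactionCapacity = 1000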

Thanks in advance for any ideas!

3 Answers:

Answer 0 (score: 0)

ConsumerTimeoutException means that no new messages arrived for a long time; it does not mean the connection to Kafka timed out.

http://kafka.apache.org/documentation.html

consumer.timeout.ms (default: -1): throw a timeout exception to the consumer if no message is available for consumption after the specified interval.
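For a standalone consumer on the old high-level API, the same property would be set in the consumer's properties, for example (an illustrative snippet; the ZooKeeper and group values are taken from the question's setup, and the 5000 ms timeout is an assumed example):

# consumer.properties (illustrative)
zookeeper.connect=zoo00:2181,zoo01:2181,zoo02:2181
group.id=flume
# throw ConsumerTimeoutException after 5 s without messages instead of blocking forever
consumer.timeout.ms=5000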

Answer 1 (score: 0)

Kafka's ConsumerConfig class has a "consumer.timeout.ms" configuration property, which Kafka sets to -1 by default. Any new Kafka consumer is expected to override the property with a suitable value.

Here is the reference from the Kafka documentation:

consumer.timeout.ms (default: -1)
By default, this value is -1 and a consumer blocks indefinitely if no new message is available for consumption. By setting the value to a positive integer, a timeout exception is thrown to the consumer if no message is available for consumption after the specified timeout value.

When Flume creates the Kafka channel, it sets the consumer.timeout.ms value to 100, as seen in the Flume logs at INFO level. That explains why we see so many of these ConsumerTimeoutExceptions.

 level: INFO Post-validation flume configuration contains configuration for agents: [agent]
 level: INFO Creating channels
 level: DEBUG Channel type org.apache.flume.channel.kafka.KafkaChannel is a custom type
 level: INFO Creating instance of channel c1 type org.apache.flume.channel.kafka.KafkaChannel
 level: DEBUG Channel type org.apache.flume.channel.kafka.KafkaChannel is a custom type
 level: INFO Group ID was not specified. Using flume as the group id.
 level: INFO {metadata.broker.list=kafka:9092, request.required.acks=-1, group.id=flume, 
              zookeeper.connect=zookeeper:2181, consumer.timeout.ms=100, auto.commit.enable=false}
 level: INFO Created channel c1

Going by the Flume user guide on Kafka channel settings, I tried to override this value by specifying the following, but that did not seem to work:

agent.channels.c1.kafka.consumer.timeout.ms=5000

Also, we ran a load test with data continuously pounding through the channel, and this exception did not occur during the tests.

Answer 2 (score: 0)

I read the Flume source code and found that Flume reads the "timeout" key and uses its value for "consumer.timeout.ms".

So you can configure the "consumer.timeout.ms" value like this:

agent1.channels.kafka_channel.timeout=-1
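Applied to the agent from the question (channel c2), a sketch of the same override; note that -1 makes the consumer block indefinitely when the topic is idle, per the Kafka documentation quoted in Answer 1, while a large positive value keeps a finite timeout (the 60000 ms figure is an illustrative assumption):

# Same override for the question's agent (illustrative values)
a1.channels.c2.timeout = -1
# ...or keep a finite, quieter timeout instead of blocking forever:
# a1.channels.c2.timeout = 60000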