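I am trying to collect tweets from the Twitter API and write them to Kafka through Flume, using the Kafka sink. The problem is that the Kafka sink does not deliver all of the tweets that Flume collects. I started the ZooKeeper and Kafka servers, created the topic twitter, and listened to that topic with a consumer. As an example, for 1000 tweets collected by Flume, Kafka shows only 100 after 1 minute of processing.
Here is the Flume conf file: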
TwitterAgent.sources = Twitter
TwitterAgent.channels = MemChannel
TwitterAgent.sinks = kafka
TwitterAgent.sources.Twitter.type = org.apache.flume.source.twitter.TwitterSource
TwitterAgent.sources.Twitter.consumerKey =
TwitterAgent.sources.Twitter.consumerSecret =
TwitterAgent.sources.Twitter.accessToken =
TwitterAgent.sources.Twitter.accessTokenSecret =
TwitterAgent.sources.Twitter.channels = MemChannel
TwitterAgent.sinks.kafka.channel = MemChannel
TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 10020
TwitterAgent.channels.MemChannel.transactionCapacity = 1300
TwitterAgent.sinks.kafka.type = org.apache.flume.sink.kafka.KafkaSink
TwitterAgent.sinks.kafka.topic = twitter
TwitterAgent.sinks.kafka.brokerList = localhost:9092
TwitterAgent.sinks.kafka.batchsize = 100
TwitterAgent.sinks.kafka.request.required.acks = -1
Thanks for your help.