我正在尝试将数据从我的kafka主题中提取出来并将其写入HDFS,并且看起来像我在几个示例中看到的一样,但我似乎无法解决以下错误。我可以通过python从主题中进行消费,所以我知道我可以。我正在使用Flume版本1.6.0和Java 9.0.1。使它不接受kafka主题我在做什么错了?
09 Jul 2018 17:17:26,973 INFO [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:145) -Creating channels
09 Jul 2018 17:17:26,984 INFO [conf-file-poller-0] (org.apache.flume.channel.DefaultChannelFactory.create:42) - Creating instance of channel kafka_hdfs_channel type memory
09 Jul 2018 17:17:26,989 INFO [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:200) - Created channel kafka_hdfs_channel
09 Jul 2018 17:17:26,989 INFO [conf-file-poller-0] (org.apache.flume.source.DefaultSourceFactory.create:41) - Creating instance of source kafka_source, type org.apache.flume.source.kafka.KafkaSource
09 Jul 2018 17:17:26,993 ERROR [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadSources:361) - Source kafka_source has been removed due to an error during configuration
org.apache.flume.conf.ConfigurationException: Kafka topic must be specified.
at org.apache.flume.source.kafka.KafkaSource.configure(KafkaSource.java:180)
at org.apache.flume.conf.Configurables.configure(Configurables.java:41)
at org.apache.flume.node.AbstractConfigurationProvider.loadSources(AbstractConfigurationProvider.java:326)
at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:97)
at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:514)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:300)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
at java.base/java.lang.Thread.run(Thread.java:844)}
这是我的水槽配置:
agentCDIS.sources = kafka_source
agentCDIS.channels = kafka_hdfs_channel
agentCDIS.sinks = hdfs_sink
agentCDIS.sources.kafka_source.type = org.apache.flume.source.kafka.KafkaSource
agentCDIS.sources.kafka_source.kafka.bootstrap.servers = 10.4.3.61:9092, 10.4.3.62:9092, 10.4.3.63:9092
agentCDIS.sources.kafka_source.kafka.topic = test
agentCDIS.sources.kafka_source.kafka.consumer.group.id = cn_flume_group
agentCDIS.sources.kafka_source.channels = kafka_hdfs_channel
agentCDIS.sources.kafka_source.interceptors = i1
agentCDIS.sources.kafka_source.interceptors.i1.type = timestamp
agentCDIS.sources.kafka_source.kafka.consumer.timeout.ms = 1000
agentCDIS.channels.kafka_hdfs_channel.type = memory
agentCDIS.channels.kafka_hdfs_channel.capacity = 10000
agentCDIS.channels.kafka_hdfs_channel.transactionCapacity = 1000
agentCDIS.sinks.hdfs_sink.type = hdfs
agentCDIS.sinks.hdfs_sink.hdfs.path = hdfs://10.4.16.16:8020/user/cnelson/kafka/%{topic}/%y-%m-%d
agentCDIS.sinks.hdfs_sink.hdfs.rollInterval = 5
agentCDIS.sinks.hdfs_sink.hdfs.rollSize = 0
agentCDIS.sinks.hdfs_sink.fileType = DataStream
agentCDIS.sinks.hdfs_sink.channel = kafka_hdfs_channel
agentCDIS.sinks.loggerSink.type = logger
agentCDIS.sinks.loggerSink.kafka_hdfs_channel = memoryChannel
agentCDIS.channels.memoryChannel.type = memory
agentCDIS.channels.memoryChannel.capacity = 100
答案 0 :(得分:0)
我浏览了几次发布和配置,并注意到-您已经提到您正在使用Flume的1.6版,并且根据documentation,其属性略有不同。您可以尝试以下方法吗?
agentCDIS.sources.kafka_source.kafka.bootstrap.servers
=>试试agentCDIS.sources.kafka_source.zookeeperConnect
-此属性的值将是您的Kafka集群使用的zookeeper URI。agentCDIS.sources.kafka_source.kafka.topic = test
=>试试agentCDIS.sources.kafka_source.topic = test
agentCDIS.sources.kafka_source.kafka.consumer.group.id = cn_flume_group
=>试试agentCDIS.sources.kafka_source.groupId = cn_flume_group
您在配置文件中使用的上述3个属性是从version 1.7中引入的。
我希望这会有所帮助!