Kafka Spout stuck after reading some messages

Time: 2018-01-16 08:54:52

Tags: apache-kafka apache-storm kafka-consumer-api

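I am having a problem with Apache Storm and Kafka. The KafkaSpout reads messages from Kafka normally, but after about 30,000 messages failed tuples start to appear and the Bolt no longer receives any messages.

I checked worker.log and saw that when the topology starts, it tries to read the partition information from Zookeeper, then reads from the broker successfully, as shown below (offset 9539):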

Read partition information from: /twitter_streaming_tweet_test/STREAMING_TWEET_WRITER_SPOUT/partition_2  --> {"partition":2,"offset":9539,"topology":{"name":"DATA_WRITER_TOPOLOGY","id":"DATA_WRITER_TOPOLOGY-67-1516077955"},"topic":"twitter_streaming_tweet_test","broker":{"port":9092,"host":"zoo1"}}

2018-01-16 17:05:57.510 o.a.s.k.PartitionManager Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Read last commit offset from zookeeper: 9539; old topology_id: DATA_WRITER_TOPOLOGY-67-1516077955 - new topology_id: DATA_WRITER_TOPOLOGY-68-1516089922
2018-01-16 17:05:57.514 o.a.s.k.PartitionManager Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Starting Kafka zoo1 Partition{host=zoo1:9092, topic=twitter_streaming_tweet_test, partition=2} from offset 9539
2018-01-16 17:05:57.518 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Finished refreshing

The topology then runs normally until about 30,000 messages have been processed:

2018-01-16 17:06:39.732 TWLogger Thread-7-STREAMING_TWEET_WRITER_BOLT-executor[3 3] [INFO] Tweet ID 952850493570654209 was saved to database
2018-01-16 17:06:39.739 TWLogger Thread-9-STREAMING_TWEET_WRITER_BOLT-executor[6 6] [INFO] Tweet ID 952850099335348224 was saved to database
2018-01-16 17:06:39.742 TWLogger Thread-7-STREAMING_TWEET_WRITER_BOLT-executor[3 3] [INFO] Tweet ID 952850787981393920 was saved to database
2018-01-16 17:06:39.753 TWLogger Thread-7-STREAMING_TWEET_WRITER_BOLT-executor[3 3] [INFO] Tweet ID 952850152573685760 was saved to database
2018-01-16 17:06:39.754 TWLogger Thread-9-STREAMING_TWEET_WRITER_BOLT-executor[6 6] [INFO] Tweet ID 952850099578654721 was saved to database
2018-01-16 17:06:39.763 TWLogger Thread-7-STREAMING_TWEET_WRITER_BOLT-executor[3 3] [INFO] Tweet ID 952850153173524481 was saved to database
2018-01-16 17:06:39.768 TWLogger Thread-9-STREAMING_TWEET_WRITER_BOLT-executor[6 6] [INFO] Tweet ID 952850099989704705 was saved to database
2018-01-16 17:06:39.776 TWLogger Thread-7-STREAMING_TWEET_WRITER_BOLT-executor[3 3] [INFO] Tweet ID 952850153232154624 was saved to database
2018-01-16 17:06:39.779 TWLogger Thread-9-STREAMING_TWEET_WRITER_BOLT-executor[6 6] [INFO] Tweet ID 952850758289956864 was saved to database
2018-01-16 17:06:39.787 TWLogger Thread-7-STREAMING_TWEET_WRITER_BOLT-executor[3 3] [INFO] Tweet ID 952850154436018176 was saved to database
2018-01-16 17:07:56.106 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Refreshing partition manager connections
2018-01-16 17:07:56.117 o.a.s.k.DynamicBrokersReader Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Read partition info from zookeeper: GlobalPartitionInformation{topic=twitter_streaming_tweet_test, partitionMap={0=zoo2:9092, 1=zoo3:9092, 2=zoo1:9092}}
2018-01-16 17:07:56.117 o.a.s.k.KafkaUtils Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] assigned [Partition{host=zoo1:9092, topic=twitter_streaming_tweet_test, partition=2}]
2018-01-16 17:07:56.117 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Deleted partition managers: []
2018-01-16 17:07:56.117 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] New partition managers: []
2018-01-16 17:07:56.117 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Finished refreshing
2018-01-16 17:09:54.150 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Refreshing partition manager connections
2018-01-16 17:09:54.160 o.a.s.k.DynamicBrokersReader Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Read partition info from zookeeper: GlobalPartitionInformation{topic=twitter_streaming_tweet_test, partitionMap={0=zoo2:9092, 1=zoo3:9092, 2=zoo1:9092}}
2018-01-16 17:09:54.160 o.a.s.k.KafkaUtils Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] assigned [Partition{host=zoo1:9092, topic=twitter_streaming_tweet_test, partition=2}]
2018-01-16 17:09:54.160 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Deleted partition managers: []
2018-01-16 17:09:54.160 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] New partition managers: []
2018-01-16 17:09:54.160 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Finished refreshing
2018-01-16 17:10:56.108 o.a.s.k.ZkCoordinator Thread-11-STREAMING_TWEET_WRITER_SPOUT-executor[9 9] [INFO] Task [3/3] Refreshing partition manager connections

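The tweets are saved normally, then the Kafka Spout tries to read the partition information from Zookeeper and finds nothing, so no tuples are processed and the topology is stuck. Can anyone help me solve this problem? Thank you very much.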

1 answer:

Answer 0 (score: 0)

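Can you check the max spout pending value? Usually, if it is set to a very high value, failed tuples will eventually show up in the Storm stats after a while, because with a very high max.spout.pending the messages time out before they are acked. If you can post the Storm stats for the spout and bolts together with your max.spout.pending value, that will help in understanding this problem.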
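
To make the timeout reasoning concrete, here is a minimal back-of-the-envelope sketch in plain Java. It assumes that, in the worst case, the last tuple emitted waits roughly max.spout.pending divided by the rate at which the bolts ack tuples for that spout task; if that wait exceeds topology.message.timeout.secs (30 seconds by default), Storm fails the tuple. The class name, method names, and the sample numbers (500 acks per second, 30,000 and 5,000 pending) are hypothetical and not taken from the question; only the two config key strings are real Storm settings.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: rough check of max.spout.pending against the tuple timeout. */
final class SpoutPendingEstimate {
    // Real Storm topology config keys; everything else in this class is illustrative.
    static final String MAX_SPOUT_PENDING = "topology.max.spout.pending";
    static final String MESSAGE_TIMEOUT_SECS = "topology.message.timeout.secs";

    /** Approximate seconds needed to ack a full window of pending tuples. */
    static double drainSeconds(int maxSpoutPending, double acksPerSecond) {
        if (maxSpoutPending <= 0 || acksPerSecond <= 0) {
            throw new IllegalArgumentException("values must be positive");
        }
        return maxSpoutPending / acksPerSecond;
    }

    /** True when the last pending tuple would probably wait longer than the timeout. */
    static boolean likelyToTimeOut(int maxSpoutPending, double acksPerSecond,
                                   int messageTimeoutSecs) {
        return drainSeconds(maxSpoutPending, acksPerSecond) > messageTimeoutSecs;
    }

    /** Builds a plain map with the two settings, keyed by the Storm config names. */
    static Map<String, Object> topologyConf(int maxSpoutPending, int messageTimeoutSecs) {
        Map<String, Object> conf = new HashMap<>();
        conf.put(MAX_SPOUT_PENDING, maxSpoutPending);
        conf.put(MESSAGE_TIMEOUT_SECS, messageTimeoutSecs);
        return conf;
    }

    public static void main(String[] args) {
        double assumedAcksPerSecond = 500.0; // assumption: measure this from the Storm UI
        int timeoutSecs = 30;                // Storm's default message timeout

        for (int pending : new int[] {30000, 5000}) {
            System.out.printf("pending=%d drain=%.1fs timesOut=%b%n",
                    pending,
                    drainSeconds(pending, assumedAcksPerSecond),
                    likelyToTimeOut(pending, assumedAcksPerSecond, timeoutSecs));
        }
        System.out.println(topologyConf(5000, timeoutSecs));
    }
}
```

In a real topology the same two values would normally be set through Storm's Config object (setMaxSpoutPending and setMessageTimeoutSecs) before submitting. The acks-per-second figure should come from the executed and acked counters in the Storm UI rather than from a guess.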