I am trying to connect from Spark to Kafka for a small POC, using the code below.
This is how I start Kafka:
bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic test
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic test --from-beginning
Here is my Spark Streaming code, which receives the messages and prints them to the console.
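import org.apache.log4j.{Level, Logger}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object ReadingFromKafkaSource extends App {
  Logger.getLogger("org").setLevel(Level.ERROR)
  val conf = new SparkConf()
    .setMaster("local[*]")
    .setAppName("test")
  // 20-second batch interval
  val streamingContext = new StreamingContext(conf, Seconds(20))
  val lines = KafkaUtils.createStream(streamingContext, "localhost:9092", "spark-streaming-configuration-group", Map("test" -> 1))
  lines.print()
  streamingContext.start()
  streamingContext.awaitTermination()
}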
I get the following error messages.
14:45:26.002 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952326000
14:45:26.204 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952326200
14:45:26.405 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952326400
14:45:26.601 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952326600
14:45:26.801 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952326800
14:45:27.000 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952327000
14:45:27.201 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952327200
14:45:27.244 [Executor task launch worker for task 99] DEBUG org.apache.zookeeper.ZooKeeper - Closing session: 0x0
14:45:27.244 [Executor task launch worker for task 99] DEBUG org.apache.zookeeper.ClientCnxn - Closing client for session: 0x0
14:45:27.401 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952327400
14:45:27.600 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952327600
14:45:27.742 [Executor task launch worker for task 99-SendThread(localhost:9092)] DEBUG org.apache.zookeeper.ClientCnxn - An exception was thrown while closing send thread for session 0x0 : Client session timed out, have not heard from server in 3005ms for sessionid 0x0
14:45:27.801 [RecurringTimer - BlockGenerator] DEBUG org.apache.spark.streaming.util.RecurringTimer - Callback for BlockGenerator called at time 1520952327800
14:45:27.844 [Executor task launch worker for task 99] DEBUG org.apache.zookeeper.ClientCnxn - Disconnecting client for session: 0x0
14:45:27.844 [Executor task launch worker for task 99] INFO org.apache.zookeeper.ZooKeeper - Session: 0x0 closed
14:45:27.844 [Executor task launch worker for task 99-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread shut down
14:45:27.844 [Executor task launch worker for task 99] INFO org.apache.spark.streaming.receiver.ReceiverSupervisorImpl - Stopping receiver with message: Error starting receiver 0: org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000
14:45:27.844 [Executor task launch worker for task 99] INFO org.apache.spark.streaming.receiver.ReceiverSupervisorImpl - Called receiver onStop
14:45:27.844 [Executor task launch worker for task 99] INFO org.apache.spark.streaming.receiver.ReceiverSupervisorImpl - Deregistering receiver 0
14:45:27.845 [dispatcher-event-loop-1] ERROR org.apache.spark.streaming.scheduler.ReceiverTracker - Deregistered receiver for stream 0: Error starting receiver 0 - org.I0Itec.zkclient.exception.ZkTimeoutException: Unable to connect to zookeeper server within timeout: 10000
at org.I0Itec.zkclient.ZkClient.connect(ZkClient.java:1232)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:156)
at org.I0Itec.zkclient.ZkClient.<init>(ZkClient.java:130)
at kafka.utils.ZkUtils$.createZkClientAndConnection(ZkUtils.scala:75)
at kafka.utils.ZkUtils$.apply(ZkUtils.scala:57)
at kafka.consumer.ZookeeperConsumerConnector.connectZk(ZookeeperConsumerConnector.scala:191)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:139)
at kafka.consumer.ZookeeperConsumerConnector.<init>(ZookeeperConsumerConnector.scala:156)
at kafka.consumer.Consumer$.create(ConsumerConnector.scala:109)
at org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:100)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:149)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:131)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:607)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:597)
at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2173)
at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2173)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
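Kafka itself is working fine, but Spark Streaming reports a problem connecting to the ZooKeeper service.
Answer 0 (score: 1)
I ran into the same error. I fixed it by changing the port number from 9092 to 2181 (taken from zoo.cfg, property clientPort=2181).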
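If you are not sure which port your ZooKeeper instance uses, you can read it from the properties file rather than guessing. The following is only a rough sketch: the object name `ZkPortLookup`, the helper `zkQuorum`, the `localhost` default, and the fallback to 2181 are all assumptions made for illustration, and the path is the one passed to zookeeper-server-start.sh in the question.

```scala
import java.io.FileInputStream
import java.util.Properties

// Hypothetical helper, not part of Spark or Kafka.
object ZkPortLookup {
  // Builds a "host:port" string from the clientPort entry of a ZooKeeper
  // properties file; falls back to 2181 (assumed default) if the key is absent.
  def zkQuorum(path: String, host: String = "localhost"): String = {
    val props = new Properties()
    val in = new FileInputStream(path)
    try props.load(in) finally in.close()
    val port = props.getProperty("clientPort", "2181").trim
    s"$host:$port"
  }

  def main(args: Array[String]): Unit = {
    // Path assumed from the question's start-up command; adjust to your install.
    println(zkQuorum("config/zookeeper.properties"))
  }
}
```

The resulting string is what the receiver-based createStream call expects as its second argument.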
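Answer 1 (score: 1)
You supplied the Kafka broker's port, but you should supply ZooKeeper's port (as you can see in the documentation), which is 2181 by default. Try using localhost:2181
instead of localhost:9092. That should fix the problem (assuming you have both Kafka and ZooKeeper running).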
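For reference, a minimal sketch of the corrected program, assuming ZooKeeper listens on localhost at its default port 2181 and that the receiver-based Kafka 0.8 integration seen in the stack trace (the org.apache.spark.streaming.kafka package) is on the classpath. Only the second argument differs from the question's code; the name `zkQuorum` and the `map(_._2)` step that prints just the message values are additions made here for illustration.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object ReadingFromKafkaSource extends App {
  val conf = new SparkConf()
    .setMaster("local[*]")
    .setAppName("test")
  val streamingContext = new StreamingContext(conf, Seconds(20))

  // The receiver-based API expects the ZooKeeper quorum here, not the broker list.
  // Assumption: ZooKeeper runs locally on its default clientPort.
  val zkQuorum = "localhost:2181"

  val lines = KafkaUtils.createStream(
    streamingContext,
    zkQuorum,
    "spark-streaming-configuration-group", // consumer group id
    Map("test" -> 1)                       // topic -> number of consumer threads
  )

  // Each record is a (key, value) pair; print only the message value.
  lines.map(_._2).print()

  streamingContext.start()
  streamingContext.awaitTermination()
}
```

With this change the receiver connects to ZooKeeper to discover the brokers, so the ZkTimeoutException should no longer occur as long as ZooKeeper is reachable at that address.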