py4j.protocol.Py4JavaError:调用o22.start时发生错误

时间:2020-05-24 03:42:56

标签: python apache-spark pyspark apache-kafka

我现在正在尝试将SparkStreaming和Kafka一起放在Ubantu上。但是问题来了。

我可以确保Kafka正常工作。

在第一个终端上:

bin/zookeeper-server-start.sh config/zookeeper.properties &
bin/kafka-server-start.sh config/server.properties

在第二个终端上:

bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic wordsendertest

然后,我创建一些数据:

bin/kafka-console-producer.sh --broker-list localhost:9092 --topic wordsendertest
hello hadoop
hello spark

在第三个终端上:

cd /usr/local/spark/mycode/kafka
/usr/local/spark/bin/spark-submit ./kafkaWordCount.py localhost:2181 wordsendertest

kafkaWordCount.py的代码:

from __future__ import print_function
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils
import sys

if __name__ == "__main__":
   if len(sys.argv) != 3:
      print("usage:KafkaWordCount.py<zk><topic>",file=sys.stderr)
      exit(-1)
   sc = SparkContext(appName="PythonStreamingKafkaWordCount")
   ssc = StreamingContext(sc,1)
   zkQuorum,topic = sys.argv[1:]
   kvs = KafkaUtils.createStream(ssc,zkQuorum,"spark-streaming-consumer",{topic:1})
   lines = kvs.map(lambda x:x[1])
   counts = lines.flatMap(lambda x:x.split(" ")).map(lambda word:(word,1)).reduceByKey(lambda a,b:a+b)
   counts.pprint
   ssc.start()
   ssc.awaitTermination()

我的错误:

Traceback (most recent call last):
  File "/usr/local/spark/mycode/kafka/./KafkaWordCount.py", line 20, in <module>
    ssc.start()
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/streaming/context.py", line 196, in start
  File "/usr/local/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1133, in __call__
  File "/usr/local/spark/python/lib/py4j-0.10.4-src.zip/py4j/protocol.py", line 319, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o22.start.
: java.lang.IllegalArgumentException: requirement failed: No output operations registered, so nothing to execute

请帮助我!谢谢!

1 个答案:

答案 0 :(得分:1)

您忘记在()函数中添加counts.pprint

counts.pprint更改为counts.pprint(),它将起作用。