I am trying to build a POC of Spark Structured Streaming with Kafka in Python; the code is below.
Spark version - 2.3.2, Kafka - 2.11-2.1.0, Hadoop - 2.8.3
I get the error below when running spark-submit.
.\bin\spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.2 spark_struct.py localhost:9092 tempre
import sys
from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .appName("StructuredNetworkWordCount") \
    .getOrCreate()

brokers, topic = sys.argv[1:]
print("broker : {} and Topic : {}".format(brokers, topic))

# Read from Kafka as a streaming DataFrame.
df = spark \
    .readStream \
    .format("kafka") \
    .option("kafka.bootstrap.servers", brokers) \
    .option("subscribe", topic) \
    .load()

# Kafka keys/values arrive as binary; cast them to strings for processing.
numbericdf = df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
numbericdf.createOrReplaceTempView("updates")
average = spark.sql("select value from updates")
print(average)

# Write each micro-batch to the console in append mode.
query = average \
    .writeStream \
    .outputMode("append") \
    .format("console") \
    .start()

query.awaitTermination()
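As an aside, the unpacking `brokers, topic = sys.argv[1:]` raises an opaque `ValueError` if the script is launched with the wrong number of arguments. A minimal sketch of defensive argument parsing (the `parse_args` helper is hypothetical, not part of the original script):

```python
import sys


def parse_args(argv):
    # Expect exactly two positional arguments: the broker list and the topic.
    if len(argv) != 3:
        raise SystemExit("usage: spark_struct.py <brokers> <topic>")
    return argv[1], argv[2]


# Example invocation mirroring the spark-submit command above.
brokers, topic = parse_args(["spark_struct.py", "localhost:9092", "tempre"])
print("broker : {} and Topic : {}".format(brokers, topic))
```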
Answer (score: 0)
The issue was resolved after moving from Java 11 to Java 8.
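For context: Spark 2.3.x is built for Java 8 and is known to fail on Java 11. A sketch of how to check and switch the JVM before submitting (the `JAVA_HOME` path is an example for Debian/Ubuntu and will differ on other systems):

```shell
# Show which Java spark-submit will pick up; Spark 2.3.x needs Java 8.
java -version 2>&1 | head -n 1

# Point Spark at a Java 8 installation before submitting (example path).
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH="$JAVA_HOME/bin:$PATH"
```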