Spark Structured Streaming query exception

Time: 2018-03-03 10:59:29

Tags: spark-streaming

Here is my streaming code:

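from pyspark import SparkConf
from pyspark.sql.types import StructType

# get_session is defined elsewhere in the script (not shown here)
session = get_session(SparkConf())

lookup = '/Users/vahagn/stream'
# Schema of the incoming JSON files
userSchema = StructType().add("auction_id", "string").add("dma", "string")
auctions = session.readStream.schema(userSchema).json("/Users/vahagn/stream/")
# Running count of rows per auction_id
inputDF = auctions.groupBy("auction_id").count()
print(inputDF.isStreaming)

inputDF.printSchema()
# Write updated counts to the console and block until the query terminates
inputDF.writeStream.outputMode("update").format("console").start().awaitTermination()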

After the first file is read, I get the error below, but it does not explain what went wrong. Any ideas?

Traceback (most recent call last):
  File "/Users/vahagn/hydra/spark/structured_streaming.py", line 257, in <module>
    inputDF.writeStream.outputMode("update").format("console").start().awaitTermination()
  File "/Users/vahagn/Downloads/spark-2.3.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/streaming.py", line 106, in awaitTermination
  File "/Users/vahagn/Downloads/spark-2.3.0-bin-hadoop2.7/python/lib/py4j-0.10.6-src.zip/py4j/java_gateway.py", line 1160, in __call__
  File "/Users/vahagn/Downloads/spark-2.3.0-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/sql/utils.py", line 75, in deco
pyspark.sql.utils.StreamingQueryException: u'null\n=== Streaming Query ===\nIdentifier: [id = 2f4b442a-38f9-41f1-a3d4-52e0a48427c0, runId = b843f25f-4132-4d52-ae64-f3be5e85a3d9]\nCurrent Committed Offsets: {}\nCurrent Available Offsets: {FileStreamSource[file:/Users/vahagn/stream]: {"logOffset":0}}\n\nCurrent State: ACTIVE\nThread State: RUNNABLE\n\nLogical Plan:\nAggregate [auction_id#0], [auction_id#0, count(1) AS count#7L]\n+- StreamingExecutionRelation FileStreamSource[file:/Users/vahagn/stream], [auction_id#0, dma#1]\n'

1 answer:

Answer 0 (score: 0)

I solved the problem by downgrading from Java 9 to Java 8.
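
Since the exception message itself is just 'null', it may help to check which JVM is in use before starting the query. Below is a minimal sketch, not part of the original fix. The function names (parse_java_major, installed_java_major, check_java_for_spark_2x) are hypothetical, and the check assumes that the `java` on PATH is the one Spark launches. If JAVA_HOME points to a different installation, the result may not match what Spark actually runs on.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Return the Java major version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (for example "1.8.0_161").
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major():
    """Run `java -version` (it writes to stderr) and parse the result."""
    result = subprocess.run(
        ["java", "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return parse_java_major(result.stderr)


def check_java_for_spark_2x():
    """Hypothetical guard: raise if the java on PATH is not Java 8."""
    major = installed_java_major()
    if major != 8:
        raise RuntimeError("This setup was reported to work with Java 8, found: %r" % major)
```

Calling check_java_for_spark_2x() at the top of the script, before the session is created, would turn the opaque streaming failure into an explicit message about the Java version. The sketch uses subprocess.run, so it needs Python 3.5 or later.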