嗨,我是新事物。我有一个用例,以窗口方式从kafka主题获取数据流,然后在其上进行分析。 我尝试了下面的代码,并抛出错误:
from pyspark.sql import SparkSession
spark=SparkSession.builder.appName("TestKakfa").getOrCreate()
df = spark \
.readStream \
.format("kafka") \
.option("kafka.bootstrap.servers", "localhost:9092") \
.option("subscribe", "dummy_events") \
.option("startingOffsets", "earliest") \
.load()
query = df.writeStream.format('console').start()
query.awaitTermination()
我遇到以下错误:
pyspark.sql.utils.StreamingQueryException: 'null\n=== Streaming Query ===\nIdentifier: [id = b842e3ba-8584-4764-8068-cce6b8d7de5f, runId = edc960c5-e5ef-4493-b9a2-c9543e1d2a8d]\nCurrent Committed Offsets: {}\nCurrent Available Offsets: {}\n\nCurrent State: INITIALIZING\nThread State: RUNNABLE'