I am using Hadoop 2.6.0-cdh5.14.2 and SPARK2-2.3.0.cloudera2-1.cdh5.13.3.p0.316101.
I get this error when starting a direct stream with KafkaUtils:
File "/home/ale/amazon_fuse_ds/bin/hdp_amazon_fuse_aggreagation.py", line 91, in setupContexts
kafka_stream = KafkaUtils.createDirectStream( self.spark_streaming_context, [ self.kafka_topicin ], kafka_configuration )
File "/opt/cloudera/parcels/SPARK2-2.3.0.cloudera2-1.cdh5.13.3.p0.316101/lib/spark2/python/lib/pyspark.zip/pyspark/streaming/kafka.py", line 145, in createDirectStream
AttributeError: 'SparkSession' object has no attribute '_jssc'
Answer 0 (score: 0)
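The object you are passing is a SparkSession, but KafkaUtils.createDirectStream expects a StreamingContext. A SparkSession has no _jssc attribute, which is why the call fails. Create a StreamingContext from the session's SparkContext and pass that instead: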
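from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils
# self.spark_streaming_context actually holds a SparkSession, so wrap its SparkContext.
# batchDuration is the batch interval in seconds.
ssc = StreamingContext(self.spark_streaming_context.sparkContext, batchDuration)
KafkaUtils.createDirectStream(ssc, ...)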
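
To catch this kind of mix-up earlier, one option is a small guard that checks the object before it reaches KafkaUtils. The following is only a sketch: require_streaming_context, FakeSparkSession and FakeStreamingContext are hypothetical names that are not part of PySpark, and the check relies on the private _jssc attribute named in the traceback, which is an implementation detail and could change between versions. The stand-in classes are there so the snippet runs without a Spark installation.

```python
def require_streaming_context(ctx):
    """Return ctx unchanged if it looks like a StreamingContext, else raise.

    Hypothetical helper, not a PySpark API. It duck-types on the private
    `_jssc` attribute that appears in the traceback above.
    """
    if not hasattr(ctx, "_jssc"):
        raise TypeError(
            "expected a StreamingContext, got %s" % type(ctx).__name__
        )
    return ctx


# Stand-in classes (assumptions for illustration only).
class FakeSparkSession(object):
    """Mimics a SparkSession: it has no `_jssc` attribute."""


class FakeStreamingContext(object):
    """Mimics a StreamingContext: it carries a `_jssc` attribute."""

    def __init__(self):
        self._jssc = object()  # placeholder for the JVM-side context


# A StreamingContext-like object passes through untouched.
ok = require_streaming_context(FakeStreamingContext())

# A SparkSession-like object is rejected with a readable message.
try:
    require_streaming_context(FakeSparkSession())
    message = None
except TypeError as exc:
    message = str(exc)
```

In real code the guard would wrap the first argument, for example KafkaUtils.createDirectStream(require_streaming_context(ssc), ...), so a wrong object is reported by type name rather than by a missing private attribute.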