pyspark: 'SparkSession' object has no attribute '_jssc'

Date: 2018-10-30 09:57:28

Tags: apache-spark apache-kafka

I am using: Hadoop 2.6.0-cdh5.14.2, SPARK2-2.3.0.cloudera2-1.cdh5.13.3.p0.316101

I get this error when starting a direct stream from KafkaUtils:

  File "/home/ale/amazon_fuse_ds/bin/hdp_amazon_fuse_aggreagation.py", line 91, in setupContexts
    kafka_stream = KafkaUtils.createDirectStream( self.spark_streaming_context, [ self.kafka_topicin ], kafka_configuration )
  File "/opt/cloudera/parcels/SPARK2-2.3.0.cloudera2-1.cdh5.13.3.p0.316101/lib/spark2/python/lib/pyspark.zip/pyspark/streaming/kafka.py", line 145, in createDirectStream
AttributeError: 'SparkSession' object has no attribute '_jssc'

I can see that SparkSession has a _jsc attribute, but not _jssc.

1 Answer:

Answer 0 (score: 0)

The object you are passing is a SparkSession, but KafkaUtils.createDirectStream expects a StreamingContext. Build one from the session's underlying SparkContext and pass that instead:

from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

# Wrap the SparkContext behind the session in a StreamingContext
ssc = StreamingContext(self.spark_streaming_context.sparkContext, batchDuration)
kafka_stream = KafkaUtils.createDirectStream(ssc, [self.kafka_topicin], kafka_configuration)
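For context, a minimal end-to-end sketch of the fix might look like the following. This is an assumption-laden illustration, not the asker's actual job: the app name, the 10-second batch interval, the `my_topic` topic, and the `localhost:9092` broker address are all placeholders I've invented, and running it requires a live Kafka broker plus the spark-streaming-kafka package on the classpath.

```python
from pyspark.sql import SparkSession
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

spark = SparkSession.builder.appName("kafka-direct-stream").getOrCreate()

# createDirectStream needs a StreamingContext (which has _jssc),
# not a SparkSession (which only has _jsc). Build one from the
# session's underlying SparkContext with an assumed 10 s batch interval.
ssc = StreamingContext(spark.sparkContext, 10)

kafka_configuration = {"metadata.broker.list": "localhost:9092"}  # assumed broker
kafka_stream = KafkaUtils.createDirectStream(ssc, ["my_topic"], kafka_configuration)

kafka_stream.pprint()   # print a few records from each batch
ssc.start()
ssc.awaitTermination()
```

The distinction comes from PySpark's internals: `SparkSession`/`SparkContext` wrap a Java SparkContext (`_jsc`), while `StreamingContext` wraps a JavaStreamingContext (`_jssc`), which is what the Kafka helper calls into.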