我正在处理之前已正确使用数据框执行的代码,但是这一次执行时,出现错误。 (唯一的区别是这次我在数据帧上使用了persist()
。)
simMat = IndexedRMat.columnSimilarities()
正确执行,然后执行此部分:
columns = ['product1', 'product2', 'sim']
vals = simMat.entries.map(lambda e: (e.i, e.j, e.value)).collect()
dfsim = spark.createDataFrame(vals, columns)
产生此错误:
AttributeErrorTraceback (most recent call last)
<ipython-input-100-11502084c71b> in <module>()
1 columns = ['product1', 'product2', 'sim']
----> 2 vals = simMat.entries.map(lambda e: (e.i, e.j, e.value)).collect()
3 dfsim = spark.createDataFrame(vals, columns)
/opt/spark-2.3.0-SNAPSHOT-bin-spark-master/python/pyspark/rdd.pyc in collect(self)
806 to be small, as all the data is loaded into the driver's memory.
807 """
--> 808 with SCCallSiteSync(self.context) as css:
809 port = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
810 return list(_load_from_socket(port, self._jrdd_deserializer))
/opt/spark-2.3.0-SNAPSHOT-bin-spark-master/python/pyspark/traceback_utils.pyc in __enter__(self)
70 def __enter__(self):
71 if SCCallSiteSync._spark_stack_depth == 0:
---> 72 self._context._jsc.setCallSite(self._call_site)
73 SCCallSiteSync._spark_stack_depth += 1
74
AttributeError: 'NoneType' object has no attribute 'setCallSite'
是什么意思?我是新来的人,却找不到这种错误的解释。