Can anyone help?
I am using PySpark in a Jupyter Notebook. I want to create a new RDD from an existing RDD.
dataRDD = sc.parallelize("welcome to pySpark by brightrace academy".split(" "))
newDataRDD = dataRDD.map(lambda line: line.upper())
newDataRDD.collect()
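For reference, here is a minimal plain-Python sketch of the result the `map` and `collect` steps above would be expected to return if Spark ran them successfully. It does not use Spark at all; the names `words` and `expected` are illustrative and not part of the original code.

```python
# Plain-Python equivalent of the RDD pipeline above (no Spark required).
# `words` and `expected` are hypothetical names used only for illustration.
words = "welcome to pySpark by brightrace academy".split(" ")

# Same transformation as dataRDD.map(lambda line: line.upper())
expected = [word.upper() for word in words]

print(expected)
# ['WELCOME', 'TO', 'PYSPARK', 'BY', 'BRIGHTRACE', 'ACADEMY']
```

Since this runs fine in plain Python, the transformation itself is probably not the cause of the error.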
I get an error when I try to create the new RDD from the existing one.
Error:
Py4JJavaError Traceback (most recent call last)
<ipython-input-10-de8c35a40e72> in <module>
2
3 newDataRDD = dataRDD.map(lambda line: line.upper())
----> 4 newDataRDD.collect()
C:\SPARK\python\pyspark\rdd.py in collect(self)