PySpark 2.0.2 (hadoop2.7) gives an error when used with Python 2.7.x
The code is:
import os
import sys

os.environ['SPARK_HOME'] = "C:/Apache/spark-2.0.2-bin-hadoop2.7"
sys.path.append("C:/Apache/spark-2.0.2-bin-hadoop2.7/python")

try:
    from pyspark import SparkContext
    from pyspark import SparkConf
    print("Successful")
except ImportError as e:
    print("Cannot import PYspark module", e)
    sys.exit(1)
When I run this code, it prints the "Cannot import PYspark module" message.
Answer 0 (score: 0)
Extend the Python path with both the pyspark and py4j archives; for Spark 2.0.2 that would be:
sys.path.append("C:/Apache/spark-2.0.2-bin-hadoop2.7/python/lib/py4j-0.10.3-src.zip")
sys.path.append("C:/Apache/spark-2.0.2-bin-hadoop2.7/python/lib/pyspark.zip")
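Putting the answer together with the original script, a minimal sketch could look like the following. It derives the paths from SPARK_HOME instead of hard-coding them, and globs for the py4j zip so the version suffix (py4j-0.10.3 for Spark 2.0.2) does not need to be spelled out. The C:/Apache install location is the one from the question; adjust it to your own directory.

```python
import glob
import os
import sys

# Spark install directory from the question; override via the
# SPARK_HOME environment variable if yours differs.
spark_home = os.environ.get("SPARK_HOME",
                            "C:/Apache/spark-2.0.2-bin-hadoop2.7")

# pyspark lives under $SPARK_HOME/python, and its bundled py4j
# dependency ships as a zip under $SPARK_HOME/python/lib;
# all of these must be on sys.path before importing pyspark.
sys.path.append(os.path.join(spark_home, "python"))
sys.path.append(os.path.join(spark_home, "python", "lib", "pyspark.zip"))

# Glob so the py4j version number is not hard-coded.
for zip_path in glob.glob(os.path.join(spark_home, "python", "lib",
                                       "py4j-*-src.zip")):
    sys.path.append(zip_path)

try:
    from pyspark import SparkConf, SparkContext
    print("Successful")
except ImportError as e:
    print("Cannot import PYspark module", e)
```

If Spark is actually installed at that location, the import succeeds and "Successful" is printed; otherwise the except branch reports the ImportError instead of crashing with an unhandled exception.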