Cannot run pyspark: Failed to find Spark jars directory

Date: 2017-09-06 14:26:41

Tags: macos hadoop pyspark

I have downloaded spark-2.1.0-bin-without-hadoop, which sits in the following directory:

 ~/Desktop/ahajib/opt/spark-2.1.0-bin-without-hadoop

When I cd into that directory, then into bin, and try to run pyspark, I get the following error:

/usr/local/bin/pyspark: line 24: ~/Desktop/ahajib/opt/spark-2.1.0-bin-without-hadoop/bin/load-spark-env.sh: No such file or directory
/Users/ahajibagheri/Desktop/ahajib/opt/spark-2.1.0-bin-without-hadoop/bin/spark-class: line 24: ~/Desktop/ahajib/opt/spark-2.1.0-bin-without-hadoop/bin/load-spark-env.sh: No such file or directory
Failed to find Spark jars directory (~/Desktop/ahajib/opt/spark-2.1.0-bin-without-hadoop/assembly/target/scala-/jars).
You need to build Spark with the target "package" before running this program.

I have already set my JAVA_HOME and SPARK_HOME:

echo $JAVA_HOME
/Library/Java/JavaVirtualMachines/jdk1.8.0_131.jdk/Contents/Home
echo $SPARK_HOME
~/Desktop/ahajib/opt/spark-2.1.0-bin-without-hadoop
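
A quick way to check whether the literal ~ stored in SPARK_HOME is what the scripts are tripping over is to expand the path both ways (a minimal diagnostic sketch using the paths above):

# Tilde is NOT expanded when it comes back out of a variable, so this fails
# if SPARK_HOME literally contains "~":
ls "$SPARK_HOME/bin/load-spark-env.sh"

# Typed at the prompt, ~ is expanded by the shell, so this should succeed:
ls ~/Desktop/ahajib/opt/spark-2.1.0-bin-without-hadoop/bin/load-spark-env.sh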

I am running everything on macOS Sierra 10.12.6. Any help with this issue would be greatly appreciated. Please let me know if I have left something out so I can update the question accordingly.

Thanks

2 Answers:

Answer 0 (score: 1)

I had the same problem. To fix it, I had to define SPARK_HOME without the home-directory shortcut (~). In your case I think it should look like this:

export SPARK_HOME="/Users/ahajibagheri/Desktop/ahajib/opt/spark-2.1.0-bin-without-hadoop"
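
Equivalently, $HOME can be used instead of spelling out the account name; unlike ~, it is expanded even inside double quotes (a sketch of the same fix):

export SPARK_HOME="$HOME/Desktop/ahajib/opt/spark-2.1.0-bin-without-hadoop"
# verify that the script the error message points at is now reachable
ls "$SPARK_HOME/bin/load-spark-env.sh"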

Answer 1 (score: 0)

In my case I had installed Spark via pip3 install pyspark, and a wrong SPARK_HOME variable was causing the error. It worked when I ran a command like this:

PYSPARK_PYTHON=python3 SPARK_HOME=/usr/local/lib/python3.7/site-packages/pyspark python3 wordcount.py a.txt
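
If you are not sure where pip put the package, its directory can be looked up from Python itself instead of hard-coding the site-packages path (a sketch, assuming the pip-installed pyspark and the wordcount.py / a.txt files from the command above):

# Print the directory of the installed pyspark package; for a pip install,
# this directory is what SPARK_HOME should point to.
python3 -c "import pyspark, os; print(os.path.dirname(pyspark.__file__))"

# Use that value directly when launching the script:
SPARK_HOME="$(python3 -c 'import pyspark, os; print(os.path.dirname(pyspark.__file__))')" \
PYSPARK_PYTHON=python3 python3 wordcount.py a.txt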