I'm sorry to ask a question I've seen asked here before, but none of the answers I've gone through seem to resolve it. I followed the installation documentation for running pyspark on my local machine. Once that was done, I tried to test the installation with:
# Start pyspark via the provided command
import pyspark
# The code below requires Spark 2+
spark = pyspark.sql.SparkSession.builder.appName('test').getOrCreate()
spark.range(10).collect()
But I still get the following error:
/Users/usr123/opt/anaconda3/lib/python3.7/site-packages/pyspark/bin/spark-class: line 71: /usr/bin/java/bin/java: Not a directory
Traceback (most recent call last):
File "test.py", line 5, in <module>
spark = pyspark.sql.SparkSession.builder.appName('test').getOrCreate()
File "/Users/usr123/opt/anaconda3/lib/python3.7/site-packages/pyspark/sql/session.py", line 173, in getOrCreate
sc = SparkContext.getOrCreate(sparkConf)
File "/Users/usr123/opt/anaconda3/lib/python3.7/site-packages/pyspark/context.py", line 349, in getOrCreate
SparkContext(conf=conf or SparkConf())
File "/Users/usr123/opt/anaconda3/lib/python3.7/site-packages/pyspark/context.py", line 115, in __init__
SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
File "/Users/usr123/opt/anaconda3/lib/python3.7/site-packages/pyspark/context.py", line 298, in _ensure_initialized
SparkContext._gateway = gateway or launch_gateway(conf)
File "/Users/usr123/opt/anaconda3/lib/python3.7/site-packages/pyspark/java_gateway.py", line 94, in launch_gateway
raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number
Has anyone found a reliable way to fix this? Is there something obvious I'm missing?
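For reference, the line `/usr/bin/java/bin/java: Not a directory` suggests that `JAVA_HOME` may be set to the `java` binary itself (`/usr/bin/java`) rather than to a JDK home directory, since `spark-class` appends `/bin/java` to it. A minimal sketch to inspect the relevant environment (the helper name `diagnose_java_env` is just illustrative, not part of pyspark):

```python
import os
import shutil
import sys

def diagnose_java_env():
    """Collect the environment facts the pyspark launch traceback depends on."""
    return {
        "JAVA_HOME": os.environ.get("JAVA_HOME"),  # should be a JDK home dir, not the java binary
        "java_on_path": shutil.which("java"),      # where `java` actually resolves on PATH
        "python_version": sys.version.split()[0],
    }

report = diagnose_java_env()
for key, value in report.items():
    print(f"{key}: {value}")
```

If `JAVA_HOME` ends in `/java`, pointing it at the enclosing JDK directory instead may resolve the "Not a directory" message.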
Answer 0: (score: 0)
We ran into a similar issue, and downgrading our Python version to 3.6 resolved it; in our case it appeared to be a conda environment incompatibility. Which exact version of Spark are you trying to run? There can be substantial differences between 2.1, 2.3, and 2.4.
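To answer the version question precisely, a quick sketch that reports both the Python and (if importable) pyspark versions; the `report_versions` helper is hypothetical and only for illustration:

```python
import sys

def report_versions():
    """Report the Python and, if installed, pyspark versions for compatibility checks."""
    versions = {"python": sys.version.split()[0]}
    try:
        import pyspark
        versions["pyspark"] = pyspark.__version__
    except ImportError:
        versions["pyspark"] = None  # pyspark is not importable in this environment
    return versions

print(report_versions())
```

Comparing the reported pyspark version against the installed Java version can narrow down whether this is a version mismatch or a pure environment issue.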