Pyspark: Error - Java gateway process exited before sending the driver its port number

Date: 2017-02-24 23:19:23

Tags: pyspark

When I try to instantiate a Spark session in Pyspark, I get this error: Exception: Java gateway process exited before sending the driver its port number. Here is the code:

from pyspark import SparkConf
from pyspark.sql import SparkSession

if __name__ == '__main__':
    SPARK_CONFIGURATION = SparkConf().setAppName("OPL").setMaster("local[*]")
    SPARK_SESSION = SparkSession.builder\
        .config(conf=SPARK_CONFIGURATION)\
        .getOrCreate()

    print("Hello world")

Here is the traceback:

Neon was unexpected at this time.
Traceback (most recent call last):
  File "C:\Users\IBM_ADMIN\Documents\Eclipse Neon for Liberty on Bluemix\OPL_Interface\src\Test\SparkTest.py", line 12, in <module>
    .config(conf=SPARK_CONFIGURATION)\
  File "C:\Users\IBM_ADMIN\Documents\spark-2.1.0-bin-hadoop2.7\python\pyspark\sql\session.py", line 169, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "C:\Users\IBM_ADMIN\Documents\spark-2.1.0-bin-hadoop2.7\python\pyspark\context.py", line 307, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "C:\Users\IBM_ADMIN\Documents\spark-2.1.0-bin-hadoop2.7\python\pyspark\context.py", line 115, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "C:\Users\IBM_ADMIN\Documents\spark-2.1.0-bin-hadoop2.7\python\pyspark\context.py", line 256, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "C:\Users\IBM_ADMIN\Documents\spark-2.1.0-bin-hadoop2.7\python\pyspark\java_gateway.py", line 95, in launch_gateway
    raise Exception("Java gateway process exited before sending the driver its port number")
Exception: Java gateway process exited before sending the driver its port number

I am using PyDev in Eclipse Neon.2 Release (4.6.2). Here is the configuration (the original post attached screenshots of the Libraries and Environment settings).

Note: I am using the latest Spark release: spark-2.1.0-bin-hadoop2.7

I checked several other entries (Pyspark: Exception: Java gateway process exited before sending the driver its port number; Spark + Python - Java gateway process exited before sending the driver its port number?) and tried most of the suggested fixes, but the error persists. This is a blocker for me, because I cannot test any of my code until I can get a SparkSession. Incidentally, I also use Spark from Java, and I do not have this problem there.

Is this a bug in Pyspark?

1 Answer:

Answer 0 (score: 0)

My coworkers and I ran into the same problem, and it blocked us and had us pulling our hair out for quite a while. We tried many of the suggestions, all without success: a Java path with no spaces in it, setting/unsetting the PYSPARK_SUBMIT_ARGS env var, and so on.
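For reference, a minimal sketch of the environment-variable workarounds mentioned above. The paths and values here are illustrative assumptions, not from the original post, and these tweaks did not resolve the issue for us:

```python
import os

# Both values below are hypothetical examples. They must be set BEFORE
# importing pyspark, because the Java gateway is launched from the current
# environment when the SparkContext is created.

# A JAVA_HOME containing spaces (e.g. under "C:\Program Files") is a
# frequently cited cause of gateway launch failures on Windows, so one
# suggested fix is pointing it at a space-free install path.
os.environ["JAVA_HOME"] = r"C:\Java\jdk1.8.0_121"  # hypothetical path

# Another commonly suggested tweak: set PYSPARK_SUBMIT_ARGS explicitly.
# When set, it must end with "pyspark-shell" for PySpark to start.
os.environ["PYSPARK_SUBMIT_ARGS"] = "--master local[*] pyspark-shell"
```

After setting these, you would import pyspark and build the SparkSession as in the question above.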

What fixed it for us was switching to Spark 2.3.1. We had been trying with 2.2.1 and 2.3.0.

Hope this saves some people some hassle.