java.util.NoSuchElementException:找不到密钥:_PYSPARK_DRIVER_CALLBACK_HOST

时间:2019-05-07 18:17:15

标签: python-3.x pyspark pycharm

我正在使用PyCharm 2019.1和Python 3.7(在Project Interpreter中) 在PyCharm上,我添加了Pyspark 2.4.2

当我运行以下代码(以创建Spark DataFrame)时,出现错误

java.util.NoSuchElementException: key not found: _PYSPARK_DRIVER_CALLBACK_HOST
....
Exception: Java gateway process exited before sending its port number

从其他SO问题来看,这似乎与版本不匹配有关, 问题是如何解决这个问题

我的$ SPARK_HOME指向Apache Spark 2.2.0, 当我尝试在Pycharm上安装2.2.0时,出现错误

Collecting pyspark==2.2.0

Could not find a version that satisfies the requirement pyspark==2.2.0 (from versions: 2.1.2, 2.1.3, 2.2.0.post0, 2.2.1, 2.2.2, 2.2.3, 2.3.0, 2.3.1, 2.3.2, 2.3.3, 2.4.0, 2.4.1, 2.4.2, 2.4.3)
No matching distribution found for pyspark==2.2.0 

关于如何解决此问题的任何想法?

CODE->

from pyspark.sql import SparkSession

d = {'a':1, 'b':2, 'c':3}

spark = SparkSession.builder.master("local").appName("CreatingDF").getOrCreate()

pandaDF = spark.createDataFrame(d)
print(pandaDF)  

错误->

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
19/05/06 23:21:45 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[main,5,main]
java.util.NoSuchElementException: key not found: _PYSPARK_DRIVER_CALLBACK_HOST
    at scala.collection.MapLike$class.default(MapLike.scala:228)
    at scala.collection.AbstractMap.default(Map.scala:59)
    at scala.collection.MapLike$class.apply(MapLike.scala:141)
    at scala.collection.AbstractMap.apply(Map.scala:59)
    at org.apache.spark.api.python.PythonGatewayServer$$anonfun$main$1.apply$mcV$sp(PythonGatewayServer.scala:50)
    at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1262)
    at org.apache.spark.api.python.PythonGatewayServer$.main(PythonGatewayServer.scala:37)
    at org.apache.spark.api.python.PythonGatewayServer.main(PythonGatewayServer.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Traceback (most recent call last):
  File "/Users/karanalang/PycharmProjects/PythonFalcon/FalconIncremental/python_createDF2.py", line 28, in <module>
    spark = SparkSession.builder.master("local").appName("CreatingDF").getOrCreate()
  File "/Users/karanalang/anaconda3/lib/python3.7/site-packages/pyspark/sql/session.py", line 173, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "/Users/karanalang/anaconda3/lib/python3.7/site-packages/pyspark/context.py", line 367, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "/Users/karanalang/anaconda3/lib/python3.7/site-packages/pyspark/context.py", line 133, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "/Users/karanalang/anaconda3/lib/python3.7/site-packages/pyspark/context.py", line 316, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "/Users/karanalang/anaconda3/lib/python3.7/site-packages/pyspark/java_gateway.py", line 46, in launch_gateway
    return _launch_gateway(conf)
  File "/Users/karanalang/anaconda3/lib/python3.7/site-packages/pyspark/java_gateway.py", line 108, in _launch_gateway
    raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number

1 个答案:

答案 0 :(得分:0)

是的,这是版本问题。在命令提示符/终端上验证您的python版本。如果默认的python版本是2.7并且pyCharm在其解释器中指向python3.7,则它应该可以工作。

主要是Anaconda3及以后的版本引起此问题。