Can't seem to initialize a Spark context (pyspark)

Time: 2019-01-21 19:31:06

Tags: python apache-spark ubuntu pyspark

When I try to run sc = SparkContext(appName="exampleName"), I get an error. The full traceback is included below:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/sharan/.local/lib/python3.5/site-packages/pyspark/context.py", line 118, in __init__
    conf, jsc, profiler_cls)
  File "/home/sharan/.local/lib/python3.5/site-packages/pyspark/context.py", line 188, in _do_init
    self._javaAccumulator = self._jvm.PythonAccumulatorV2(host, port)
  File "/home/sharan/.local/lib/python3.5/site-packages/py4j/java_gateway.py", line 1525, in __call__
    answer, self._gateway_client, None, self._fqn)
  File "/home/sharan/.local/lib/python3.5/site-packages/py4j/protocol.py", line 332, in get_return_value
    format(target_id, ".", name, value))
py4j.protocol.Py4JError: An error occurred while calling None.org.apache.spark.api.python.PythonAccumulatorV2. Trace:
py4j.Py4JException: Constructor org.apache.spark.api.python.PythonAccumulatorV2([class java.lang.String, class java.lang.Integer]) does not exist
    at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:179)
    at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:196)
    at py4j.Gateway.invoke(Gateway.java:237)
    at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
    at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:748)

I don't know how to debug this. Are there any logs I can look at? Am I missing a specific package that should be installed on my Ubuntu machine?

1 Answer:

Answer 0: (score: 1)

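This happens because the pyspark version differs from the Spark version. The traceback points the same way: the Python side calls the PythonAccumulatorV2 constructor with (String, Integer) arguments, and the JVM side reports that no such constructor exists. If you have Spark 2.4.7 installed, use pyspark 2.4.7 as well.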

To find the Spark version, check the Spark UI or run any of the following commands:

spark-submit --version or spark-shell --version or spark-sql --version

To install a specific version of pyspark, use the following command:

pip install pyspark==2.4.7
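
The version comparison described above can also be done from Python. The following is only a rough sketch, not an official tool: the helper names (parse_spark_version, versions_match, installed_spark_version) are made up for illustration, and it assumes that the spark-submit found on PATH belongs to the Spark installation you actually run against and that its --version banner contains the text "version X.Y.Z". If pyspark was installed with pip, the spark-submit on PATH may be the one that pip installed, in which case the two versions will trivially agree and you should call the spark-submit of your real Spark installation instead.

```python
import re
import shutil
import subprocess


def parse_spark_version(text):
    """Return the first 'version X.Y.Z' found in the given text, or None.

    Assumption: the Spark banner prints its own version before any
    other 'version ...' string (such as the Scala version).
    """
    match = re.search(r"version\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


def versions_match(pyspark_version, spark_version):
    """Compare the first three numeric components, ignoring suffixes like '.dev0'."""
    def key(version):
        return tuple(re.findall(r"\d+", version)[:3])
    return key(pyspark_version) == key(spark_version)


def installed_spark_version():
    """Run 'spark-submit --version' if it is on PATH and parse its output."""
    exe = shutil.which("spark-submit")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # The banner may be written to stderr, so look at both streams.
    return parse_spark_version(result.stdout + result.stderr)


if __name__ == "__main__":
    try:
        import pyspark
        py_version = pyspark.__version__
    except ImportError:
        py_version = None

    spark_version = installed_spark_version()

    if py_version is None or spark_version is None:
        print("Could not determine both versions:", py_version, spark_version)
    elif versions_match(py_version, spark_version):
        print("pyspark {} matches Spark {}".format(py_version, spark_version))
    else:
        print("Mismatch: pyspark {} vs Spark {}".format(py_version, spark_version))
        print("Try: pip install pyspark=={}".format(spark_version))
```

If the script reports a mismatch, installing the pyspark release that matches the reported Spark version, as in the pip command above, is the fix this answer suggests.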