When I try to run sc = SparkContext(appName="exampleName"), I get the error below. I have included the entire traceback:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/sharan/.local/lib/python3.5/site-packages/pyspark/context.py", line 118, in __init__
conf, jsc, profiler_cls)
File "/home/sharan/.local/lib/python3.5/site-packages/pyspark/context.py", line 188, in _do_init
self._javaAccumulator = self._jvm.PythonAccumulatorV2(host, port)
File "/home/sharan/.local/lib/python3.5/site-packages/py4j/java_gateway.py", line 1525, in __call__
answer, self._gateway_client, None, self._fqn)
File "/home/sharan/.local/lib/python3.5/site-packages/py4j/protocol.py", line 332, in get_return_value
format(target_id, ".", name, value))
py4j.protocol.Py4JError: An error occurred while calling None.org.apache.spark.api.python.PythonAccumulatorV2. Trace:
py4j.Py4JException: Constructor org.apache.spark.api.python.PythonAccumulatorV2([class java.lang.String, class java.lang.Integer]) does not exist
at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:179)
at py4j.reflection.ReflectionEngine.getConstructor(ReflectionEngine.java:196)
at py4j.Gateway.invoke(Gateway.java:237)
at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
I have no idea how to debug this. Are there any logs I can access? Am I missing a specific package that should be installed on my Ubuntu machine?
Answer 0 (score: 1)
This happens because your pyspark version does not match your Spark version. For example, if you have Spark 2.4.7 installed, install pyspark 2.4.7 as well.
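The idea can be sketched as a small version check. This is my own illustration, not part of Spark; the helper name and the choice to compare only major.minor (patch-level differences are usually harmless, while the py4j wire API can change between minor releases) are assumptions:

```python
def versions_match(spark_version: str, pyspark_version: str) -> bool:
    """Return True if the two version strings agree on major.minor.

    Hypothetical helper for illustration: e.g. Spark 2.4.7 with
    pyspark 2.4.4 passes, but Spark 2.4.7 with pyspark 3.0.1 fails,
    which is the kind of mismatch that produces the Py4JError above.
    """
    return spark_version.split(".")[:2] == pyspark_version.split(".")[:2]

print(versions_match("2.4.7", "2.4.7"))  # True
print(versions_match("2.4.7", "3.0.1"))  # False
```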
To find your Spark version, check the Spark UI or run any of the following commands:

spark-submit --version

or

spark-shell --version

or

spark-sql --version
To install a specific version of pyspark, use the following command:
pip install pyspark==2.4.7
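To confirm which pyspark version is actually installed in your environment, you can query the package metadata from Python without starting Spark. A quick check, assuming Python 3.8+ for importlib.metadata:

```python
from importlib.metadata import version, PackageNotFoundError

# Print the installed pyspark version so it can be compared
# against the Spark version reported by spark-submit --version.
try:
    print("pyspark", version("pyspark"))
except PackageNotFoundError:
    print("pyspark is not installed in this environment")
```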