py4j.protocol.Py4JError when using a jar file in PySpark

Date: 2019-01-02 10:42:41

Tags: python apache-spark jar jvm py4j

I am trying to use the XGBoost library for Scala, which runs on Spark. For this I am using jar files. I downloaded the scala xgboost jar files xgboost4j-spark_2.11-0.80-p1.jar and xgboost4j_2.11-0.80-p1.jar and placed them in the $SPARK_HOME/jars folder.

After doing this, I tried the following line:

sc._jvm.ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier("dasd")

This resulted in the following error:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/py4j/java_gateway.py", line 1159, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/py4j/java_gateway.py", line 985, in send_command
    response = connection.send_command(command)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/py4j/java_gateway.py", line 1164, in send_command
    "Error while receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 45 more

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/py4j/java_gateway.py", line 1598, in __getattr__
    raise Py4JError("{0} does not exist in the JVM".format(new_fqn))
py4j.protocol.Py4JError: ml.dmlc.xgboost4j.scala.spark.XGBoostClassifier does not exist in the JVM

Initially, I thought this was because the system could not find the jars. To test this hypothesis, I added a new jar to the $SPARK_HOME/jars folder. The jar is located here. I ran the following line:

>>> sc._jvm.com.ippontech.Hello.Hello.hello('cat')
hello, cat

It ran without any issues. I am starting to believe the problem lies in the jar files themselves, but I am not sure how to proceed from here.

This is what I am using:

0 answers:

No answers
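A general note on the setup described above: copying jars into $SPARK_HOME/jars is one way to expose them to the JVM, but they can also be handed to PySpark explicitly at launch, which makes it unambiguous which jars the gateway actually received. A sketch using the standard --jars flag, assuming the jar files from the question sit in the current directory:

```shell
# Hand both xgboost4j jars to the driver and executors at launch time.
# xgboost4j-spark depends on xgboost4j, so both must be listed.
pyspark --jars xgboost4j_2.11-0.80-p1.jar,xgboost4j-spark_2.11-0.80-p1.jar
```

Equivalently, the spark.jars property can be set in spark-defaults.conf.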