
After starting DFS, YARN, and Spark, I ran the following command from the Spark root directory on the master host:

MASTER=yarn ./bin/run-example ml.LogisticRegressionExample data/mllib/sample_libsvm_data.txt

I took this command from the Spark README. The source code for LogisticRegressionExample is on GitHub: https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/ml/LogisticRegressionExample.scala

Then this error occurred:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Path does not exist: hdfs://master:9000/user/root/data/mllib/sample_libsvm_data.txt;

First, I don't understand where hdfs://master:9000/user/root comes from. I set the NameNode address to hdfs://master:9000, but why does Spark choose /user/root?

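Next, I created the directory /user/root/data/mllib/sample_libsvm_data.txt on every host in the cluster, hoping Spark would find the file there. But the same error occurred again. Please tell me how to fix it.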

Running the Spark machine learning example on YARN fails

1 Answer:
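A likely explanation, offered as a sketch rather than a confirmed diagnosis: when Spark runs with a Hadoop configuration whose default filesystem is HDFS, a relative path such as data/mllib/sample_libsvm_data.txt is resolved against the submitting user's HDFS home directory, /user/<username>, which is /user/root here. Creating /user/root/... on each host's local disk does not help, because the path is looked up in HDFS, not on the local filesystem. The commands below assume that the hdfs client is on the PATH, that you run them from the Spark root directory, and that you run them as the same user that submits the job; the variable names are only illustrative.

```shell
#!/bin/sh
# Sketch: upload the sample data into the HDFS home directory so the
# relative path used by run-example resolves to an existing HDFS file.

# Assumption: run from the Spark root directory, where this file exists locally.
SRC="data/mllib/sample_libsvm_data.txt"

# Relative HDFS paths resolve against /user/<username>.
HDFS_HOME="/user/$(whoami)"
DEST_DIR="$HDFS_HOME/$(dirname "$SRC")"

if command -v hdfs >/dev/null 2>&1; then
  # Create the target directory in HDFS (not on the local disk).
  hdfs dfs -mkdir -p "$DEST_DIR"
  # Copy the local file into HDFS, overwriting any earlier copy.
  hdfs dfs -put -f "$SRC" "$DEST_DIR/"
  # List the directory to confirm the upload.
  hdfs dfs -ls "$DEST_DIR"
else
  echo "hdfs client not found; would upload $SRC to $DEST_DIR"
fi
```

After the upload, rerunning the original run-example command should find hdfs://master:9000/user/root/data/mllib/sample_libsvm_data.txt, provided the job is submitted as root.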