I have installed Spark 1.1.1 and Hadoop 2.4.0 on Windows.
When I submit a job with spark-submit in yarn-client mode, like this:
bin\spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client lib\spark-examples-1.1.1-hadoop2.4.0.jar 5
I get the following error:
14/12/12 11:18:35 INFO YarnClientSchedulerBackend: Application report from ASM:
appMasterRpcPort: -1
appStartTime: 1418411892899
yarnAppState: FAILED
Exception in thread "main" org.apache.spark.SparkException: Yarn application already ended,might be killed or not able to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApp(YarnClientSchedulerBackend.scala:117)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:93)
Checking the YARN logs, I can see the following error:
Log Contents:
java.lang.NoClassDefFoundError: '-Dspark/tachyonStore/folderName=spark-3cb0e049-788b-487f-8f5d-3d52c6ad209b'
Caused by: java.lang.ClassNotFoundException: '-Dspark.tachyonStore.folderName=spark-3cb0e049-788b-487f-8f5d-3d52c6ad209b'
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: '-Dspark.tachyonStore.folderName=spark-3cb0e049-788b-487f-8f5d-3d52c6ad209b'. Program will exit.
Exception in thread "main"
Inspecting the launch command itself:
14/12/11 12:34:51 INFO Client: command: %JAVA_HOME%/bin/java -server -Xmx512m
  -Djava.io.tmpdir=%PWD%/tmp
  '-Dspark.tachyonStore.folderName=spark-3cb0e049-788b-487f-8f5d-3d52c6ad209b'
  '-Dspark.yarn.secondary.jars=' '-Dspark.driver.host=dev1-win-vm'
  '-Dspark.driver.appUIHistoryAddress=' '-Dspark.app.name=Spark Pi'
  '-Dspark.driver.appUIAddress=dev1-win-vm:4040'
  '-Dspark.jars=file:/C:/deploy/spark-1.1.1-bin-hadoop2.4/lib/spark-examples-1.1.1-hadoop2.4.0.jar'
  '-Dspark.fileserver.uri=http://192.168.53.94:53420'
  '-Dspark.master=yarn-client' '-Dspark.driver.port=53416'
  org.apache.spark.deploy.yarn.ExecutorLauncher --class 'notused' --jar null
  --arg 'dev1-win-vm:53416' --executor-memory 1024 --executor-cores 1 --num-executors 2
  1> <LOG_DIR>/stdout 2> <LOG_DIR>/stderr
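As far as I can tell, cmd.exe does not treat single quotes as quoting characters, so java.exe receives the token '-Dspark.tachyonStore.folderName=...' literally, quotes included. Since that token no longer starts with -, the launcher interprets it as the main class name, which is exactly what the ClassNotFoundException above shows. A minimal way to reproduce this at a Windows prompt (foo=bar is just a placeholder property):

C:\> java '-Dfoo=bar'
Exception in thread "main" java.lang.NoClassDefFoundError: '-Dfoo=bar'

(The exact wording varies by JDK version; on Java 7+ it reads "Error: Could not find or load main class '-Dfoo=bar'".)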
I suspect the command would work if the -D flags were quoted with double quotes instead of single quotes.
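For comparison, this is how I would expect the relevant part of the command to be quoted (same command as above, trimmed to the first few flags; only the quote characters are changed):

%JAVA_HOME%/bin/java -server -Xmx512m -Djava.io.tmpdir=%PWD%/tmp "-Dspark.tachyonStore.folderName=spark-3cb0e049-788b-487f-8f5d-3d52c6ad209b" "-Dspark.yarn.secondary.jars=" "-Dspark.driver.host=dev1-win-vm" ... org.apache.spark.deploy.yarn.ExecutorLauncher ...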
Does anyone know how to fix this? I am wondering whether there is some setting that controls how these arguments are quoted.