我是Spark的新手,我正忙着设置启用了HA的Spark Cluster。
通过以下方式启动火花外壳进行测试时:container
我收到以下错误(请参阅下面的完整错误):bash spark-shell --master yarn --deploy-mode client
应用程序在纱线网应用程序上标记为失败,并且没有启动容器。
通过file:/tmp/spark-126d2844-5b37-461b-98a4-3f3de5ece91b/__spark_libs__3045590511279655158.zip does not exist
启动shell时,它会打开而不会出错。
我注意到文件只被写入创建shell的节点上的tmp文件夹。
任何帮助将不胜感激。如果需要更多信息,请与我们联系。
环境变量:
HADOOP_CONF_DIR = /选择/ Hadoop的2.7.3的/ etc / hadoop的/
YARN_CONF_DIR = /选择/ Hadoop的2.7.3的/ etc / hadoop的/
SPARK_HOME = /选择/火花2.0.2彬hadoop2.7 /
完整错误消息:
spark-shell --master local
纱-site.xml中
16/11/30 21:08:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/11/30 21:08:49 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
16/11/30 21:09:03 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Container marked as failed: container_e14_1480532715390_0001_02_000003 on host: slave2. Exit status: -1000. Diagnostics: File file:/tmp/spark-126d2844-5b37-461b-98a4-3f3de5ece91b/__spark_libs__3045590511279655158.zip does not exist
java.io.FileNotFoundException: File file:/tmp/spark-126d2844-5b37-461b-98a4-3f3de5ece91b/__spark_libs__3045590511279655158.zip
does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:611)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:824)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:601)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
16/11/30 22:29:28 ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED! 16/11/30 22:29:28 ERROR spark.SparkContext: Error initializing SparkContext. java.lang.IllegalStateException: Spark context stopped while waiting for backend
at org.apache.spark.scheduler.TaskSchedulerImpl.waitBackendReady(TaskSchedulerImpl.scala:584)
at org.apache.spark.scheduler.TaskSchedulerImpl.postStartHook(TaskSchedulerImpl.scala:162)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:546)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2258)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:831)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:823)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:823)
at org.apache.spark.repl.Main$.createSparkSession(Main.scala:95)
at $line3.$read$$iw$$iw.<init>(<console>:15)
at $line3.$read$$iw.<init>(<console>:31)
at $line3.$read.<init>(<console>:33)
at $line3.$read$.<init>(<console>:37)
at $line3.$read$.<clinit>(<console>)
at $line3.$eval$.$print$lzycompute(<console>:7)
at $line3.$eval$.$print(<console>:6)
at $line3.$eval.$print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply$mcV$sp(SparkILoop.scala:38)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:37)
at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:214)
at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:37)
at org.apache.spark.repl.SparkILoop.loadFiles(SparkILoop.scala:94)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:920)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
at org.apache.spark.repl.Main$.doMain(Main.scala:68)
at org.apache.spark.repl.Main$.main(Main.scala:51)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
答案 0 :(得分:1)
此错误是由core-site.xml文件中的配置引起的。
请注意,要查找此文件您的
HADOOP_CONF_DIR
env变量 必须设定。在我的情况下,我添加了
HADOOP_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop/
./conf/spark-env.sh
<强>芯-site.xml中强>
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property>
</configuration>
如果此端点无法访问,或者Spark检测到文件系统与当前系统相同,则lib文件将不会分发到群集中的其他节点,从而导致上述错误。
在我的情况下,我所在的节点无法到达指定主机上的端口9000。
<强>调试强>
将日志级别提升至info。您可以通过以下方式执行此操作:
将./conf/log4j.properties.template
复制到./conf/log4j.properties
在文件集log4j.logger.org.apache.spark.repl.Main = INFO
正常启动Spark Shell。如果您的问题与我的相同,您应该看到一条信息消息,例如:INFO Client: Source and destination file systems are the same. Not copying file:/tmp/spark-c1a6cdcd-d348-4253-8755-5086a8931e75/__spark_libs__1391186608525933727.zip
这会导致您遇到问题,因为它会启动因丢失文件而导致的列车反应。
答案 1 :(得分:0)
我没有在您的日志中看到任何错误,只有警告可以通过添加环境变量来避免:
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
例外:尝试手动设置纱线的火花配置: http://badrit.com/blog/2015/2/29/running-spark-on-yarn#.WD_e66IrJsM
hdfs dfs -mkdir -p /user/spark/share/lib<br>
hdfs dfs -put $SPARK_HOME/assembly/lib/spark-assembly_*.jar /user/spark/share/lib/spark-assembly.jar<br>
export SPARK_JAR=hdfs://your-server:port/user/spark/share/lib/spark-assembly.jar
希望得到这个帮助。
答案 2 :(得分:0)
您必须在spark会话中将配置设置为master(“ local [*]”)。我已将其删除,并且可以正常工作。
答案 3 :(得分:0)
就我而言,spark 用户帐户无法读取/递归到 HADOOP_HOME,因此无法读取 core-site.xml。
spark@ubuntu$ ls -lrt /opt/hadoop/
ls: cannot open directory '/opt/hadoop/': Permission denied <--- Cannot read the directory
spark@ubuntu$ ls -lrt /opt
total 20
drwxrwx--- 3 hadoop 1003 4096 Jun 18 20:38 hadoop <---- Invalid group
drwxr-xr-x 3 spark spark 4096 Jun 19 04:24 spark
建议运行 ls -la $HADOOP_CONF_DIR
以确保提交 Spark 作业的帐户可以读取 core-site.xml。