I am submitting this job in YARN mode and the submit itself appears to go through fine, but I get the error below.
I added the locally installed Spark jars, and the Spark jars can also be placed in a world-readable location on HDFS so that YARN can cache them on the nodes. I also added YARN_CONF_DIR and HADOOP_CONF_DIR to .bashrc.
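For reference, here is a minimal sketch (Scala, using the Hadoop FileSystem API) of how the Spark runtime archive could be staged to that world-readable HDFS location. The HDFS destination matches the spark.yarn.archive value further below; the local source path is only an assumption for illustration:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.fs.permission.FsPermission

object StageSparkArchive {
  def main(args: Array[String]): Unit = {
    // Destination matches the spark.yarn.archive value used in the SparkConf below.
    val dst = new Path("hdfs://localhost:9000/user/spark/share/lib/spark2-hdp-yarn-archive.tar.gz")
    // Local archive path is hypothetical; adjust to wherever the archive actually lives.
    val src = new Path("file:///tmp/spark2-hdp-yarn-archive.tar.gz")

    val fs = FileSystem.get(dst.toUri, new Configuration()) // picks up *-site.xml from HADOOP_CONF_DIR
    fs.mkdirs(dst.getParent)
    fs.copyFromLocalFile(false, true, src, dst)             // keep the local copy, overwrite any existing file
    fs.setPermission(dst, new FsPermission("644"))          // world-readable so every NodeManager can localize it
    fs.close()
  }
}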
Error:
Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher
9068548 [bioingine-management-service-akka.actor.default-dispatcher-14] INFO org.apache.spark.deploy.yarn.Client - Application report for application_1531990849146_0010 (state: FAILED)
9068548 [bioingine-management-service-akka.actor.default-dispatcher-14] INFO org.apache.spark.deploy.yarn.Client -
client token: N/A
diagnostics: Application application_1531990849146_0010 failed 2 times due to AM Container for appattempt_1531990849146_0010_000002 exited with exitCode: 1
Failing this attempt.Diagnostics: [2018-07-19 11:56:58.484]Exception from container-launch.
Container id: container_1531990849146_0010_02_000001
Exit code: 1
[2018-07-19 11:56:58.484]
[2018-07-19 11:56:58.486]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher
[2018-07-19 11:56:58.486]
[2018-07-19 11:56:58.486]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher
[2018-07-19 11:56:58.486]
For more detailed output, check the application tracking page: http://localhost:8088/cluster/app/application_1531990849146_0010 Then click on links to logs of each attempt.
. Failing the application.
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1532001413903
final status: FAILED
tracking URL: http://localhost:8088/cluster/app/application_1531990849146_0010
user: root
9068611 [bioingine-management-service-akka.actor.default-dispatcher-14] INFO org.apache.spark.deploy.yarn.Client - Deleted staging directory file:/root/.sparkStaging/application_1531990849146_0010
9068612 [bioingine-management-service-akka.actor.default-dispatcher-14] ERROR org.apache.spark.SparkContext - Error initializing SparkContext.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:173)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
at com.bioingine.smash.management.services.SmashExtractorService.getSparkSession(SmashExtractorService.scala:79)
at com.bioingine.smash.management.services.SmashExtractorService.getFileHeaders(SmashExtractorService.scala:83)
at com.bioingine.smash.management.services.SmashService$$anonfun$getcolumnHeaders$1.apply(SmashService.scala:90)
at com.bioingine.smash.management.services.SmashService$$anonfun$getcolumnHeaders$1.apply(SmashService.scala:90)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
9068618 [bioingine-management-service-akka.actor.default-dispatcher-14] INFO o.s.jetty.server.AbstractConnector - Stopped Spark@2152c728{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
9068619 [bioingine-management-service-akka.actor.default-dispatcher-14] INFO org.apache.spark.ui.SparkUI - Stopped Spark web UI at http://localhost:4040
9068620 [dispatcher-event-loop-16] WARN o.a.s.s.c.YarnSchedulerBackend$YarnSchedulerEndpoint - Attempted to request executors before the AM has registered!
9068621 [bioingine-management-service-akka.actor.default-dispatcher-14] INFO o.a.s.s.c.YarnClientSchedulerBackend - Shutting down all executors
9068621 [dispatcher-event-loop-17] INFO o.a.s.s.c.YarnSchedulerBackend$YarnDriverEndpoint - Asking each executor to shut down
9068622 [bioingine-management-service-akka.actor.default-dispatcher-14] INFO o.a.s.s.c.SchedulerExtensionServices - Stopping SchedulerExtensionServices
(serviceOption=None,
services=List(),
started=false)
9068622 [bioingine-management-service-akka.actor.default-dispatcher-14] INFO o.a.s.s.c.YarnClientSchedulerBackend - Stopped
9068623 [dispatcher-event-loop-20] INFO o.a.s.MapOutputTrackerMasterEndpoint - MapOutputTrackerMasterEndpoint stopped!
9068624 [bioingine-management-service-akka.actor.default-dispatcher-14] ERROR org.apache.spark.util.Utils - Uncaught exception in thread bioingine-management-service-akka.actor.default-dispatcher-14
java.lang.NullPointerException: null
at org.apache.spark.network.shuffle.ExternalShuffleClient.close(ExternalShuffleClient.java:141)
at org.apache.spark.storage.BlockManager.stop(BlockManager.scala:1485)
at org.apache.spark.SparkEnv.stop(SparkEnv.scala:90)
at org.apache.spark.SparkContext$$anonfun$stop$11.apply$mcV$sp(SparkContext.scala:1937)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1317)
at org.apache.spark.SparkContext.stop(SparkContext.scala:1936)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:587)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
at com.bioingine.smash.management.services.SmashExtractorService.getSparkSession(SmashExtractorService.scala:79)
at com.bioingine.smash.management.services.SmashExtractorService.getFileHeaders(SmashExtractorService.scala:83)
at com.bioingine.smash.management.services.SmashService$$anonfun$getcolumnHeaders$1.apply(SmashService.scala:90)
at com.bioingine.smash.management.services.SmashService$$anonfun$getcolumnHeaders$1.apply(SmashService.scala:90)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
9068624 [bioingine-management-service-akka.actor.default-dispatcher-14] INFO org.apache.spark.SparkContext - Successfully stopped SparkContext
9068627 [bioingine-management-service-akka.actor.default-dispatcher-15] ERROR akka.actor.ActorSystemImpl - Error during processing of request: 'Yarn application has already ended! It might have been killed or unable to launch application master.'. Completing with 500 Internal Server Error response. To change default exception handling behavior, provide a custom ExceptionHandler.
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:85)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:173)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2509)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:909)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:901)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:901)
at com.bioingine.smash.management.services.SmashExtractorService.getSparkSession(SmashExtractorService.scala:79)
at com.bioingine.smash.management.services.SmashExtractorService.getFileHeaders(SmashExtractorService.scala:83)
at com.bioingine.smash.management.services.SmashService$$anonfun$getcolumnHeaders$1.apply(SmashService.scala:90)
at com.bioingine.smash.management.services.SmashService$$anonfun$getcolumnHeaders$1.apply(SmashService.scala:90)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:39)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:415)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
**SparkSession configuration**
new SparkConf().setMaster("yarn").setAppName("Test").set("spark.executor.memory", "3g")
.set("spark.ui.enabled","true")
.set("spark.driver.memory","9g")
.set("spark.default.parallelism","10")
.set("spark.executor.cores","3")
.set("spark.cores.max","9")
.set("spark.memory.offHeap.enabled","true")
.set("spark.memory.offHeap.size","6g")
.set("spark.yarn.am.memory","2g")
.set("spark.yarn.am.cores","2")
.set("spark.yarn.am.cores","2")
.set("spark.yarn.archive","hdfs://localhost:9000/user/spark/share/lib/spark2-hdp-yarn-archive.tar.gz")
.set("spark.yarn.jars","hdfs://localhost:9000/user/spark/share/lib/spark-yarn_2.11.2.2.0.jar")
**We have added the configuration below**
1. These entries are in $SPARK_HOME/conf/spark-defaults.conf
spark.driver.extraJavaOptions -Dhdp.version=2.9.0
spark.yarn.am.extraJavaOptions -Dhdp.version=2.9.0
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
2. yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle,spark_shuffle</value>
<description>shuffle service that needs to be set for Map Reduce to run</description>
</property>
<property>
<name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
<value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.application.classpath</name>
<value>/usr/share/hadoop/etc/hadoop,/usr/share/hadoop/*,/usr/share/hadoop/lib/*,/usr/share/hadoop/share/hadoop/common/*,/usr/share/hadoop/share/hadoop/common/lib/*,/usr/share/hadoop/share/hadoop/hdfs/*,/usr/share/hadoop/share/hadoop/hdfs/lib/*,/usr/share/hadoop/share/hadoop/mapreduce/*,/usr/share/hadoop/share/hadoop/mapreduce/lib/*,/usr/share/hadoop/share/hadoop/tools/lib/*,/usr/share/hadoop/share/hadoop/yarn/*,/usr/share/hadoop/share/hadoop/yarn/lib/*,/usr/share/spark/jars/spark-yarn_2.11-2.2.0.jar</value>
</property>
</configuration>
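For context on the spark_shuffle entries above: the aux-service is only exercised when the Spark side opts in to the external shuffle service. Those keys are not in the SparkConf posted earlier (they may be set elsewhere, e.g. in spark-defaults.conf), so the lines below are a hedged sketch of what that opt-in looks like, not part of the actual configuration:

.set("spark.shuffle.service.enabled","true")
.set("spark.shuffle.service.port","7337")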
3. spark-env.sh
export HADOOP_CONF_DIR=/home/hadoop/hadoop/etc/hadoop
export SPARK_HOME=/home/hadoop/spark
SPARK_DIST_CLASSPATH="/usr/share/spark/jars/*"
4. .bashrc
export JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64/"
export SBT_OPTS="-Xms16G -Xmx16G"
export HADOOP_INSTALL=/usr/share/hadoop
export HADOOP_CONF_DIR=/usr/share/hadoop/etc/hadoop/
export YARN_CONF_DIR=/usr/share/hadoop/etc/hadoop/
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export SPARK_CLASSPATH="/usr/share/spark/jars/*"
export SPARK_HOME="/usr/share/spark/"
export PATH=$PATH:$SPARK_HOME