Spark on Yarn error: Yarn application has already ended! It might have been killed or unable to launch application master

Time: 2019-08-27 19:41:56

Tags: apache-spark yarn spark-shell

When I start spark-shell --master yarn --deploy-mode client, I get the following error:

  

Yarn application has already ended! It might have been killed or unable to launch application master.

Here is the full log from Yarn:

  

19/08/28 00:54:55 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032

     

Container: container_1566921956926_0010_01_000001 on rhel7-cloudera-dev_33917
===============================================================================
LogType:stderr
Log Upload Time: August 28, 2019 00:46:31
LogLength:523
Log Contents:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/yarn/local/usercache/rhel/filecache/26/__spark_libs__5634501618166443611.zip/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/etc/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

     

LogType:stdout
Log Upload Time: August 28, 2019 00:46:31
LogLength:5597
Log Contents:
2019-08-28 00:46:19 INFO SignalUtils:54 - Registered signal handler for TERM
2019-08-28 00:46:19 INFO SignalUtils:54 - Registered signal handler for HUP
2019-08-28 00:46:19 INFO SignalUtils:54 - Registered signal handler for INT
2019-08-28 00:46:19 INFO SecurityManager:54 - Changing view acls to: yarn,rhel
2019-08-28 00:46:19 INFO SecurityManager:54 - Changing modify acls to: yarn,rhel
2019-08-28 00:46:19 INFO SecurityManager:54 - Changing view acls groups to:
2019-08-28 00:46:19 INFO SecurityManager:54 - Changing modify acls groups to:
2019-08-28 00:46:19 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn,rhel); groups with view permissions: Set(); users with modify permissions: Set(yarn,rhel); groups with modify permissions: Set()
2019-08-28 00:46:20 INFO ApplicationMaster:54 - Preparing Local resources
2019-08-28 00:46:21 INFO ApplicationMaster:54 - ApplicationAttemptId: appattempt_1566921956926_0010_000001
2019-08-28 00:46:21 INFO ApplicationMaster:54 - Waiting for Spark driver to be reachable.
2019-08-28 00:46:21 INFO ApplicationMaster:54 - Driver now available: rhel7-cloudera-dev:34872
2019-08-28 00:46:21 INFO TransportClientFactory:267 - Successfully created connection to rhel7-cloudera-dev/192.168.56.112:34872 after 107 ms (0 ms spent in bootstraps)
2019-08-28 00:46:22 INFO ApplicationMaster:54 -
===============================================================================
YARN executor launch context:
  env:
    CLASSPATH -> {{PWD}} {{PWD}}/__spark_conf__ {{PWD}}/__spark_libs__/ $HADOOP_CONF_DIR $HADOOP_COMMON_HOME/share/hadoop/common/ $HADOOP_COMMON_HOME/share/hadoop/common/lib/ $HADOOP_HDFS_HOME/share/hadoop/hdfs/ $HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/ $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/ $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/ $HADOOP_YARN_HOME/share/hadoop/yarn/ $HADOOP_YARN_HOME/share/hadoop/yarn/lib/* $HADOOP_COMMON_HOME/ $HADOOP_COMMON_HOME/lib/ $HADOOP_HDFS_HOME/ $HADOOP_HDFS_HOME/lib/ $HADOOP_MAPRED_HOME/ $HADOOP_MAPRED_HOME/lib/ $HADOOP_YARN_HOME/ $HADOOP_YARN_HOME/lib/ $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/ $HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/ /etc/hadoop-2.6.0/etc/hadoop:/etc/hadoop-2.6.0/share/hadoop/common/lib/:/etc/hadoop-2.6.0/share/hadoop/common/:/etc/hadoop-2.6.0/share/hadoop/hdfs:/etc/hadoop-2.6.0/share/hadoop/hdfs/lib/:/etc/hadoop-2.6.0/share/hadoop/hdfs/:/etc/hadoop-2.6.0/share/hadoop/yarn/lib/:/etc/hadoop-2.6.0/share/hadoop/yarn/:/etc/hadoop-2.6.0/share/hadoop/mapreduce/lib/:/etc/hadoop-2.6.0/share/hadoop/mapreduce/:/etc/hadoop-2.6.0/contrib/capacity-scheduler/.jar {{PWD}}/__spark_conf__/__hadoop_conf__
    SPARK_DIST_CLASSPATH -> /etc/hadoop-2.6.0/etc/hadoop:/etc/hadoop-2.6.0/share/hadoop/common/lib/:/etc/hadoop-2.6.0/share/hadoop/common/:/etc/hadoop-2.6.0/share/hadoop/hdfs:/etc/hadoop-2.6.0/share/hadoop/hdfs/lib/:/etc/hadoop-2.6.0/share/hadoop/hdfs/:/etc/hadoop-2.6.0/share/hadoop/yarn/lib/:/etc/hadoop-2.6.0/share/hadoop/yarn/:/etc/hadoop-2.6.0/share/hadoop/mapreduce/lib/:/etc/hadoop-2.6.0/share/hadoop/mapreduce/:/etc/hadoop-2.6.0/contrib/capacity-scheduler/.jar
    SPARK_YARN_STAGING_DIR -> *********(redacted)
    SPARK_USER -> *********(redacted)
    SPARK_CONF_DIR -> /etc/spark/conf
    SPARK_HOME -> /etc/spark

     

  command:
    {{JAVA_HOME}}/bin/java \
      -server \
      -Xmx1024m \
      -Djava.io.tmpdir={{PWD}}/tmp \
      '-Dspark.driver.port=34872' \
      -Dspark.yarn.app.container.log.dir= \
      -XX:OnOutOfMemoryError='kill %p' \
      org.apache.spark.executor.CoarseGrainedExecutorBackend \
      --driver-url \
      spark://CoarseGrainedScheduler@rhel7-cloudera-dev:34872 \
      --executor-id \
       \
      --hostname \
       \
      --cores \
      1 \
      --app-id \
      application_1566921956926_0010 \
      --user-class-path \
      file:$PWD/__app__.jar \
      1>/stdout \
      2>/stderr

     

  resources:
    __spark_libs__ -> resource { scheme: "hdfs" host: "rhel7-cloudera-dev" port: 9000 file: "/user/rhel/.sparkStaging/application_1566921956926_0010/__spark_libs__5634501618166443611.zip" } size: 232107209 timestamp: 1566933362350 type: ARCHIVE visibility: PRIVATE
    __spark_conf__ -> resource { scheme: "hdfs" host: "rhel7-cloudera-dev" port: 9000 file: "/user/rhel/.sparkStaging/application_1566921956926_0010/__spark_conf__.zip" } size: 208377 timestamp: 1566933365411 type: ARCHIVE visibility: PRIVATE

     

===============================================================================
2019-08-28 00:46:22 INFO RMProxy:98 - Connecting to ResourceManager at /0.0.0.0:8030
2019-08-28 00:46:22 INFO YarnRMClient:54 - Registering the ApplicationMaster
2019-08-28 00:46:22 INFO YarnAllocator:54 - Will request 2 executor container(s), each with 1 core(s) and 1408 MB memory (including 384 MB of overhead)
2019-08-28 00:46:22 INFO YarnAllocator:54 - Submitted 2 unlocalized container requests.
2019-08-28 00:46:22 INFO ApplicationMaster:54 - Started progress reporter thread with (heartbeat : 3000, initial allocation : 200) intervals
2019-08-28 00:46:22 ERROR ApplicationMaster:43 - RECEIVED SIGNAL TERM
2019-08-28 00:46:23 INFO ApplicationMaster:54 - Final app status: UNDEFINED, exitCode: 16, (reason: Shutdown hook called before final status was reported.)
2019-08-28 00:46:23 INFO ShutdownHookManager:54 - Shutdown hook called

     

Container: container_1566921956926_0010_02_000001 on rhel7-cloudera-dev_33917
===============================================================================
LogType:stderr
Log Upload Time: August 28, 2019 00:46:31
LogLength:3576
Log Contents:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/yarn/local/usercache/rhel/filecache/26/__spark_libs__5634501618166443611.zip/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/etc/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Exception in thread "main" java.io.IOException: Failed on local exception: java.io.IOException; Host Details : local host is: "rhel7-cloudera-dev/192.168.56.112"; destination host is: "rhel7-cloudera-dev":9000;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
    at org.apache.hadoop.ipc.Client.call(Client.java:1474)
    at org.apache.hadoop.ipc.Client.call(Client.java:1401)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy9.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:752)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy10.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1977)
    at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1118)
    at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$7$$anonfun$apply$3.apply(ApplicationMaster.scala:235)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$7$$anonfun$apply$3.apply(ApplicationMaster.scala:232)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$7.apply(ApplicationMaster.scala:232)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$7.apply(ApplicationMaster.scala:197)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$5.run(ApplicationMaster.scala:800)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
    at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:799)
    at org.apache.spark.deploy.yarn.ApplicationMaster.(ApplicationMaster.scala:197)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:823)
    at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:854)
    at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
Caused by: java.io.IOException
    at org.apache.hadoop.ipc.Client$Connection.waitForWork(Client.java:935)
    at org.apache.hadoop.ipc.Client$Connection.run(Client.java:967)
Caused by: java.lang.InterruptedException
    ... 2 more

     

LogType:stdout
Log Upload Time: August 28, 2019 00:46:31
LogLength:975
Log Contents:
2019-08-28 00:46:26 INFO SignalUtils:54 - Registered signal handler for TERM
2019-08-28 00:46:26 INFO SignalUtils:54 - Registered signal handler for HUP
2019-08-28 00:46:26 INFO SignalUtils:54 - Registered signal handler for INT
2019-08-28 00:46:27 INFO SecurityManager:54 - Changing view acls to: yarn,rhel
2019-08-28 00:46:27 INFO SecurityManager:54 - Changing modify acls to: yarn,rhel
2019-08-28 00:46:27 INFO SecurityManager:54 - Changing view acls groups to:
2019-08-28 00:46:27 INFO SecurityManager:54 - Changing modify acls groups to:
2019-08-28 00:46:27 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn,rhel); groups with view permissions: Set(); users with modify permissions: Set(yarn,rhel); groups with modify permissions: Set()
2019-08-28 00:46:28 INFO ApplicationMaster:54 - Preparing Local resources
2019-08-28 00:46:28 ERROR ApplicationMaster:43 - RECEIVED SIGNAL TERM

Any suggestions on how to resolve this issue?

0 answers:

No answers yet.