I am using HDP 2.6. I downloaded the latest version of Spark (2.2.1) and am trying to run my jar (built as an assembly against that same Spark version) with spark-submit. However, I get the error:

Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found

My $HADOOP_CONF_DIR is /etc/hadoop/conf, which is a link to /usr/hdp/current/hadoop-client/conf. The yarn.application.classpath in yarn-site.xml contains the entry /usr/hdp/current/hadoop-yarn-client/*, and that directory contains the jar hadoop-yarn-common.jar, which in turn contains the class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider. That is why I do not understand what is going on.
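For reference, this is roughly how I checked that the class is present in the HDP jar (a sketch — the jar name on your node may carry an HDP version suffix, and unzip -l works just as well as the JDK's jar tool):

# hypothetical verification, not part of the submission output below:
cd /usr/hdp/current/hadoop-yarn-client
jar tf hadoop-yarn-common.jar | grep RequestHedgingRMFailoverProxyProvider
# expected: org/apache/hadoop/yarn/client/RequestHedgingRMFailoverProxyProvider.class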
I performed these checks based on the suggestions in: link 1 link 2. Below is the full stack trace, in case it is useful:
[root@omm101 bin]# pwd
/opt/spark_2.2.1/spark-2.2.1-bin-hadoop2.7/bin
[root@omm101 bin]# echo $HADOOP_CONF_DIR
/etc/hadoop/conf
[root@omm101 bin]# ./spark-submit --class net.atos.ooredooom.reportengine.trareport.TraReport --master yarn --deploy-mode client --num-executors 18 --executor-cores 5 --executor-memory 15g --driver-memory 1g /root/jars/report-compute-engine.jar
18/01/25 17:22:50 INFO spark.SparkContext: Running Spark version 2.2.1
18/01/25 17:22:50 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/01/25 17:22:51 INFO spark.SparkContext: Submitted application: TRA-Report
18/01/25 17:22:51 INFO spark.SecurityManager: Changing view acls to: root
18/01/25 17:22:51 INFO spark.SecurityManager: Changing modify acls to: root
18/01/25 17:22:51 INFO spark.SecurityManager: Changing view acls groups to:
18/01/25 17:22:51 INFO spark.SecurityManager: Changing modify acls groups to:
18/01/25 17:22:51 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
18/01/25 17:22:51 INFO util.Utils: Successfully started service 'sparkDriver' on port 41613.
18/01/25 17:22:51 INFO spark.SparkEnv: Registering MapOutputTracker
18/01/25 17:22:51 INFO spark.SparkEnv: Registering BlockManagerMaster
18/01/25 17:22:51 INFO storage.BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
18/01/25 17:22:51 INFO storage.BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
18/01/25 17:22:51 INFO storage.DiskBlockManager: Created local directory at /tmp/blockmgr-27408430-08bf-4252-b320-e68e6d103154
18/01/25 17:22:51 INFO memory.MemoryStore: MemoryStore started with capacity 366.3 MB
18/01/25 17:22:51 INFO spark.SparkEnv: Registering OutputCommitCoordinator
18/01/25 17:22:51 INFO util.log: Logging initialized @1743ms
18/01/25 17:22:51 INFO server.Server: jetty-9.3.z-SNAPSHOT
18/01/25 17:22:51 INFO server.Server: Started @1814ms
18/01/25 17:22:51 INFO server.AbstractConnector: Started ServerConnector@c835d12{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
18/01/25 17:22:51 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@2a2bb0eb{/jobs,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@58783f6c{/jobs/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@512d92b{/jobs/job,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1bc53649{/jobs/job/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@47d93e0d{/stages,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@751e664e{/stages/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@182b435b{/stages/stage,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3704122f{/stages/stage/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@60afd40d{/stages/pool,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3f2049b6{/stages/pool/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@ea27e34{/storage,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@e72dba7{/storage/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1dfd5f51{/storage/rdd,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@24855019{/storage/rdd/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4d4d8fcf{/environment,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6f0628de{/environment/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@1e392345{/executors,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4ced35ed{/executors/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7bd69e82{/executors/threadDump,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@51b01960{/executors/threadDump/json,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@27dc79f7{/static,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@7674a051{/,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@6754ef00{/api,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@3301500b{/jobs/job/kill,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@15deb1dc{/stages/stage/kill,null,AVAILABLE,@Spark}
18/01/25 17:22:51 INFO ui.SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.4.110.24:4040
18/01/25 17:22:51 INFO spark.SparkContext: Added JAR file:/root/jars/report-compute-engine.jar at spark://10.4.110.24:41613/jars/report-compute-engine.jar with timestamp 1516886571716
18/01/25 17:22:52 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
18/01/25 17:22:52 INFO impl.TimelineClientImpl: Timeline service address: http://omm103.in.nawras.com.om:8188/ws/v1/timeline/
18/01/25 17:22:52 INFO service.AbstractService: Service org.apache.hadoop.yarn.client.api.impl.YarnClientImpl failed in state STARTED; cause: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2227)
at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:161)
at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:94)
at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:187)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:153)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:173)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2516)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:918)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:910)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:910)
at net.atos.ooredooom.reportengine.trareport.TraReport$.main(TraReport.scala:14)
at net.atos.ooredooom.reportengine.trareport.TraReport.main(TraReport.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2219)
... 25 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
... 26 more
18/01/25 17:22:52 ERROR spark.SparkContext: Error initializing SparkContext.
java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2227)
at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:161)
at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:94)
at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:187)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:153)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:173)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2516)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:918)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:910)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:910)
at net.atos.ooredooom.reportengine.trareport.TraReport$.main(TraReport.scala:14)
at net.atos.ooredooom.reportengine.trareport.TraReport.main(TraReport.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2219)
... 25 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
... 26 more
18/01/25 17:22:52 INFO server.AbstractConnector: Stopped Spark@c835d12{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
18/01/25 17:22:52 INFO ui.SparkUI: Stopped Spark web UI at http://10.4.110.24:4040
18/01/25 17:22:52 WARN cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to request executors before the AM has registered!
18/01/25 17:22:52 INFO cluster.YarnClientSchedulerBackend: Stopped
18/01/25 17:22:52 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
18/01/25 17:22:52 INFO memory.MemoryStore: MemoryStore cleared
18/01/25 17:22:52 INFO storage.BlockManager: BlockManager stopped
18/01/25 17:22:52 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
18/01/25 17:22:52 WARN metrics.MetricsSystem: Stopping a MetricsSystem that is not running
18/01/25 17:22:52 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
18/01/25 17:22:52 INFO spark.SparkContext: Successfully stopped SparkContext
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2227)
at org.apache.hadoop.yarn.client.RMProxy.createRMFailoverProxyProvider(RMProxy.java:161)
at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:94)
at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:72)
at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:187)
at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:153)
at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:173)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2516)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:918)
at org.apache.spark.sql.SparkSession$Builder$$anonfun$6.apply(SparkSession.scala:910)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:910)
at net.atos.ooredooom.reportengine.trareport.TraReport$.main(TraReport.scala:14)
at net.atos.ooredooom.reportengine.trareport.TraReport.main(TraReport.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2219)
... 25 more
Caused by: java.lang.ClassNotFoundException: Class org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider not found
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
... 26 more
18/01/25 17:22:52 INFO util.ShutdownHookManager: Shutdown hook called
18/01/25 17:22:52 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-8ea65e11-e6d7-480c-874b-e35071fa6d7f
Any support/hints are greatly appreciated.
EDIT

For lack of a better idea, I added --jars /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-common-2.7.3.2.6.1.0-129.jar to spark-submit, which led to an IllegalAccessError: tried to access method org.apache.hadoop.yarn.client.RMFailoverProxyProvider.getProxyInternal(). When that did not work, I tried putting hadoop-yarn-common-2.7.3.2.6.1.0-129.jar into .../<spark_dir>/jars and got the same result.

So basically the question is why Spark is not using the HDP jars. It should be picking them up (when I force this lib onto the classpath, I do see the IllegalAccessError). Spark 2.2.1 does ship with a jar hadoop-yarn-common-2.7.3.jar, but that jar does not contain RequestHedgingRMFailoverProxyProvider (perhaps it is HDP-specific?).
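To back up that last claim, listing Spark's bundled jar shows only the Configured variant (again a sketch, assuming the stock JDK jar tool and the install path from my session above):

# hypothetical check against Spark's bundled Hadoop 2.7 jars:
jar tf /opt/spark_2.2.1/spark-2.2.1-bin-hadoop2.7/jars/hadoop-yarn-common-2.7.3.jar | grep FailoverProxyProvider
# prints the RMFailoverProxyProvider interface and ConfiguredRMFailoverProxyProvider,
# but no RequestHedgingRMFailoverProxyProvider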
Answer 0 (score: 0)
The HDP default configuration uses org.apache.hadoop.yarn.client.RequestHedgingRMFailoverProxyProvider as its failover-proxy-provider implementation. Spark, however, does not ship with that implementation but with another one called ConfiguredRMFailoverProxyProvider. Both implement the interface RMFailoverProxyProvider (see the docs: https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-common/apidocs/org/apache/hadoop/yarn/client/RMFailoverProxyProvider.html).

So to resolve the issue, simply follow this guide and change the configuration on HDP accordingly: https://community.hortonworks.com/content/supportkb/178800/errorclass-orgapachehadoopyarnclientrequesthedging.html
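In short, the idea is to point the YARN client at the provider class that Spark's bundled Hadoop 2.7 jars actually contain. Assuming the standard yarn.client.failover-proxy-provider property (my paraphrase of the linked article, not a recipe verified on this cluster), that can be changed in the yarn-site.xml that $HADOOP_CONF_DIR hands to Spark, or overridden per submission through Spark's spark.hadoop.* passthrough:

# a sketch: spark.hadoop.* entries are copied into the Hadoop Configuration
# used by the YARN client, overriding the cluster-side default for this job only
./spark-submit \
  --master yarn --deploy-mode client \
  --conf spark.hadoop.yarn.client.failover-proxy-provider=org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider \
  --class net.atos.ooredooom.reportengine.trareport.TraReport \
  /root/jars/report-compute-engine.jar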