找不到Spark文件。在本地模式下工作正常但在群集模式下失败

时间:2015-09-25 15:48:19

标签: apache-spark

大家好,                         我有简单的spark应用程序,其中我有很少的spring上下文和规则xml文件。所有这些文件都是项目的一部分,位于资源文件夹(reource \ db \ rule \ rule2.xml)下,并且在spark本地模式下工作正常。当我在纱线群集模式下运行相同的应用程序时,它抱怨找不到文件rule2.xml并且它的部分Maven构建了jar。我是否需要为群集模式指定不同格式的文件?我是否需要对应用程序进行任何更改才能在群集模式下工作?任何帮助将不胜感激

以下是我正在阅读xml文件的代码

 JaxbUtils.unmarshalRule(
            ByteStreams.toByteArray(
            Resources.getResource(String.format("db/rule/rule2.xml", id)).openStream()));

Here is the error log

/24 15:57:07 INFO storage.BlockManager: Registering executor with local external shuffle service.
15/09/24 15:57:07 INFO util.AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@bdaolc011node08.sabre.com:40589/user/HeartbeatReceiver
15/09/24 15:57:09 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 0
15/09/24 15:57:09 INFO executor.Executor: Running task 0.0 in stage 0.0 (TID 0)
15/09/24 15:57:09 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 0
15/09/24 15:57:09 INFO storage.MemoryStore: ensureFreeSpace(3132) called with curMem=0, maxMem=555755765
15/09/24 15:57:09 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.1 KB, free 530.0 MB)
15/09/24 15:57:09 INFO storage.BlockManagerMaster: Updated info of block broadcast_0_piece0
15/09/24 15:57:09 INFO broadcast.TorrentBroadcast: Reading broadcast variable 0 took 134 ms
15/09/24 15:57:09 INFO storage.MemoryStore: ensureFreeSpace(6144) called with curMem=3132, maxMem=555755765
15/09/24 15:57:09 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.0 KB, free 530.0 MB)
15/09/24 15:57:12 INFO support.ClassPathXmlApplicationContext: Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@3c6db742: startup date [Thu Sep 24 15:57:12 CDT 2015]; root of context hierarchy
15/09/24 15:57:12 INFO xml.XmlBeanDefinitionReader: Loading XML bean definitions from class path resource [spring/rules-engine-spring.xml]
15/09/24 15:57:13 INFO xml.XmlBeanDefinitionReader: Loading XML bean definitions from class path resource [spring/ere-spring.xml]
15/09/24 15:57:13 INFO support.DefaultListableBeanFactory: Overriding bean definition for bean 'nativeRuleBuilder': replacing [Generic bean: class [com.sabre.sp.ere.core.loader.DroolsNativeRuleBuilder]; scope=; abstract=false; lazyInit=false; autowireMode=0; dependencyCheck=0; autowireCandidate=true; primary=false; factoryBeanName=null; factoryMethodName=null; initMethodName=null; destroyMethodName=null; defined in class path resource [spring/ere-spring.xml]] with [Generic bean: class [com.sabre.sp.ere.core.loader.DroolsNativeRuleBuilder]; scope=; abstract=false; lazyInit=false; autowireMode=0; dependencyCheck=0; autowireCandidate=true; primary=false; factoryBeanName=null; factoryMethodName=null; initMethodName=null; destroyMethodName=null; defined in class path resource [spring/rules-engine-spring.xml]]
15/09/24 15:57:13 INFO support.DefaultListableBeanFactory: Overriding bean definition for bean 'rulesExecutor': replacing [Generic bean: class [com.sabre.sp.ere.core.executor.DroolsRulesExecutor]; scope=; abstract=false; lazyInit=false; autowireMode=0; dependencyCheck=0; autowireCandidate=true; primary=false; factoryBeanName=null; factoryMethodName=null; initMethodName=null; destroyMethodName=null; defined in class path resource [spring/ere-spring.xml]] with [Generic bean: class [com.sabre.sp.ere.core.executor.DroolsRulesExecutor]; scope=; abstract=false; lazyInit=false; autowireMode=0; dependencyCheck=0; autowireCandidate=true; primary=false; factoryBeanName=null; factoryMethodName=null; initMethodName=null; destroyMethodName=null; defined in class path resource [spring/rules-engine-spring.xml]]
15/09/24 15:57:13 INFO support.PropertySourcesPlaceholderConfigurer: Loading properties file from class path resource [spring/ere-test.properties]
15/09/24 15:57:13 WARN support.PropertySourcesPlaceholderConfigurer: Could not load properties from class path resource [spring/ere-test.properties]: class path resource [spring/ere-test.properties] cannot be opened because it does not exist
15/09/24 15:57:13 INFO support.PropertySourcesPlaceholderConfigurer: Loading properties file from class path resource [spring/ere-spring.properties]
15/09/24 15:57:13 INFO annotation.AutowiredAnnotationBeanPostProcessor: JSR-330 'javax.inject.Inject' annotation found and supported for autowiring
15/09/24 15:57:13 INFO jdbc.JDBCRDD: closed connection
java.lang.IllegalArgumentException: resource spring/rule2.xml not found.
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
at com.google.common.io.Resources.getResource(Resources.java:152)
at com.sabre.rules.AppRuleExecutor.rule(AppRuleExecutor.java:50)
at com.sabre.rules.AppRuleExecutor.executeRules(AppRuleExecutor.java:39)
at com.sabre.rules.RuleComponent.executeRules(RuleComponent.java:43)
at com.sabre.rules.SMAAlertImpl$1.call(SMAAlertImpl.java:60)
at com.sabre.rules.SMAAlertImpl$1.call(SMAAlertImpl.java:37)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$4$1.apply(JavaRDDLike.scala:143)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$4$1.apply(JavaRDDLike.scala:143)
at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:634)
at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:634)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

1 个答案:

答案 0 :(得分:0)

假设您的代码位于main()中,如下所示:

  

JaxbUtils.unmarshalRule(               ByteStreams.toByteArray(               Resources.getResource(String.format(args [0],id))。openStream()));

因此,您需要调用以下代码:

  

spark-submit --master yarn --deploy-mode cluster --files db / rule / rule2.xml mySample.jar rule2.xml

它做什么?它告诉yarn以集群模式部署代码/ jar并将“--files”中指定的文件保存在容器上。在群集模式下,可以在群集中的任何一个节点中创建容器。但是,一旦在“--files”中指定了它,就需要仅通过其名称引用它,而不是指定完全限定的路径。