My Spark job fails with the exception below, and I cannot figure out which missing requirement causes the failure:
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed
at scala.Predef$.require(Predef.scala:221)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6$$anonfun$apply$3.apply(Client.scala:472)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6$$anonfun$apply$3.apply(Client.scala:470)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6.apply(Client.scala:470)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$6.apply(Client.scala:468)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:468)
at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:727)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:142)
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1021)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1081)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:742)
The spark-submit command:
spark-submit --conf spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/xyz/conf/log4j.xml \
-DHOME=/xyz/transformation -DENV=e1 \
-DJOB=xformation --conf spark.local.dir=/warehouse/tmp/spark1489619325 \
--queue dev --master yarn --deploy-mode cluster \
--properties-file /xyz/conf/job.conf \
--files /xyz/conf/e1.properties --class TransformationJob /xyz/job.jar
The same program runs fine when the master is set to local.
Any suggestions would be greatly appreciated. Thanks in advance.
Answer 0 (score: 0)
I had a huge list of jars on the classpath, passed via the --jars option, and one of those jars was the culprit. The problem went away once I removed it from --jars, though I am still not sure why spark-submit failed because of that jar.
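If you need to hunt for the offending jar in a long list, duplicated entries (or two jars sharing a file name) are one known trigger for this bare "requirement failed", as the WARN discussed in answer 2 below suggests. A minimal, hypothetical Scala sketch for spotting such duplicates, assuming your --jars value is available as a comma-separated string (the example paths are made up):

// Hypothetical helper, not part of Spark: flag jars in a --jars list
// that share a file name, one known trigger for this require failure.
object JarListCheck {
  def main(args: Array[String]): Unit = {
    // Assumed example list; in practice paste your actual --jars value here.
    val jars = "/libs/a.jar,/libs/b.jar,/other/a.jar".split(',')
    val dupes = jars
      .groupBy(p => p.substring(p.lastIndexOf('/') + 1)) // group by file name
      .filter { case (_, paths) => paths.length > 1 }
    dupes.foreach { case (name, paths) =>
      println(s"$name appears ${paths.length} times: ${paths.mkString(", ")}")
    }
  }
}

This is only a screening step; if no duplicates show up, bisecting the --jars list is the next resort.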
Answer 1 (score: 0)
In my case the error was caused by a data file: I was pointing at the wrong training data location, so the training data and the test data did not match.

Training data: 0,1 0 0 1 0 0 1 0 0 1
Test data: 1, 2, 0, 0, 10

I corrected the training data path and the problem was solved.
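For context, this is the kind of check that produces a bare "requirement failed" in MLlib: if the first field of each row is read as the label, the training row above carries 10 features while the test row carries fewer, and feature-count validation fails. A hedged Scala sketch, illustration only (Vectors is Spark MLlib's org.apache.spark.mllib.linalg.Vectors; the require here mimics the dimension check, it is not the actual library code):

import org.apache.spark.mllib.linalg.Vectors

// Illustration: mismatched feature counts between training and test data
// surface as "java.lang.IllegalArgumentException: requirement failed".
object DimensionCheck {
  def main(args: Array[String]): Unit = {
    val train = Vectors.dense(1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0) // 10 features
    val test  = Vectors.dense(2.0, 0.0, 0.0, 10.0)                               // 4 features
    require(train.size == test.size,
      s"feature count mismatch: train=${train.size}, test=${test.size}")
  }
}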
Answer 2 (score: -1)
I got a similar error:
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed
at scala.Predef$.require(Predef.scala:221)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$8$$anonfun$apply$5.apply(Client.scala:501)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$8$$anonfun$apply$5.apply(Client.scala:499)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$8.apply(Client.scala:499)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$8.apply(Client.scala:497)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:497)
at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:763)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:143)
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1109)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1169)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
The fix: in the terminal or in the logs there will be a WARN line like this:

WARN Client: Same path resource file:... added multiple times to distributed cache.

Simply remove that extra jar from your spark-submit script. Hope this helps everyone.
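For context, this matches how the failing require in Client.prepareLocalResources behaves: distributing a resource whose path was already added yields no localized path (only the WARN above), and the subsequent require on that result throws the message-less "requirement failed". A simplified Scala sketch of that behavior — an illustration under those assumptions, not the actual Spark source:

import scala.collection.mutable

// Simplified illustration, not the actual Spark source: the YARN client
// tracks every resource shipped to the distributed cache. Localizing the
// same path twice logs the WARN above and yields null, and the following
// require(...) then throws the bare "requirement failed" exception.
object DistributedCacheSketch {
  private val cachedResources = mutable.Set.empty[String]

  // Returns the localized path, or null when the path was already cached.
  def distribute(path: String): String = {
    if (cachedResources.add(path)) path
    else {
      println(s"WARN Client: Same path resource $path added multiple times to distributed cache.")
      null
    }
  }

  def main(args: Array[String]): Unit = {
    // Duplicate entry, e.g. the same jar listed in both --jars and --files.
    val resources = Seq("file:/xyz/job.jar", "file:/xyz/job.jar")
    resources.foreach { r =>
      val localizedPath = distribute(r)
      require(localizedPath != null) // IllegalArgumentException: requirement failed
    }
  }
}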