I submitted a Spark Streaming job in yarn-cluster mode,
but I am getting the error shown below.
spark-submit command:
export SPARK_CLASSPATH=/usr/hdp/current/hbase-client/lib/hbase-common.jar:/usr/hdp/current/hbase-client/lib/hbase-client.jar:/usr/hdp/current/hbase-client/lib/hbase-server.jar:/usr/hdp/current/hbase-client/lib/hbase-protocol.jar:/usr/hdp/current/hbase-client/lib/guava-12.0.1.jar:/usr/hdp/current/hbase-client/lib/htrace-core-3.1.0-incubating.jar
spark-submit --master yarn-cluster --keytab /etc/security/keytabs/srvc_egsc_hdpuser.service.keytab --principal srvc_egsc_hdpuser@EAPKDC.HOUSTON.HP.COM --queue sc_streaming --class com.reni.scmplatform.data.producer.DPMain --executor-memory 5g --driver-memory 8g --conf spark.sql.shuffle.partitions=10 --conf spark.default.parallelism=50 --jars /usr/hdp/current/hbase-client/lib/hbase-common.jar,/usr/hdp/current/hbase-client/lib/hbase-client.jar,/usr/hdp/current/hbase-client/lib/hbase-server.jar,/usr/hdp/current/hbase-client/lib/hbase-protocol.jar,/usr/hdp/current/hbase-client/lib/guava-12.0.1.jar,/usr/hdp/current/hbase-client/lib/htrace-core-3.1.0-incubating.jar --files /etc/spark/conf/hbase-site.xml,/etc/spark/conf/hive-site.xml hdfs://EAPROD/EA/supplychain/streaming/logistics/entaly/jars/DataProducer-assembly-1.0.15-SNAPSHOT.jar --platform.framework.hdfs.logging.dir=/EA/supplychain/process/logs/logistics/entaly/dataProducer --platform.framework.logging.level=info --platform.framework.logging.publish=true
Error:
18/03/12 05:14:30 ERROR ApplicationMaster: User class threw exception: org.apache.spark.SparkException: Exception when registering SparkListener
org.apache.spark.SparkException: Exception when registering SparkListener
at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2154)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:578)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2280)
at org.apache.spark.streaming.StreamingContext.<init>(StreamingContext.scala:140)
at org.apache.spark.streaming.StreamingContext$$anonfun$getOrCreate$1.apply(StreamingContext.scala:877)
at org.apache.spark.streaming.StreamingContext$$anonfun$getOrCreate$1.apply(StreamingContext.scala:877)
at scala.Option.map(Option.scala:145)
at org.apache.spark.streaming.StreamingContext$.getOrCreate(StreamingContext.scala:877)
at com.reni.scmplatform.data.producer.helper.DPStreamEventHandler.start(DPStreamEventHandler.scala:63)
at com.reni.scmplatform.data.producer.DPMain$.main(DPMain.scala:27)
at com.reni.scmplatform.data.producer.DPMain.main(DPMain.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:561)
Caused by: java.lang.ClassNotFoundException: com.pepperdata.spark.metrics.PepperdataSparkListener
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:175)
at org.apache.spark.SparkContext$$anonfun$setupAndStartListenerBus$1.apply(SparkContext.scala:2122)
at org.apache.spark.SparkContext$$anonfun$setupAndStartListenerBus$1.apply(SparkContext.scala:2119)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:34)
at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2119)
... 15 more
18/03/12 05:14:30 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
18/03/12 05:14:30 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
18/03/12 05:14:30 INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: org.apache.spark.SparkException: Exception when registering SparkListener)
Answer 0 (score: 0)
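You should use the --jars option to add the JAR that contains the missing class (here com.pepperdata.spark.metrics.PepperdataSparkListener) to the job classpath (see this answer: spark submit add multiple jars in classpath).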
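To decide which JAR to pass to --jars, it helps to first find the one that actually contains the missing class. The following is only a minimal sketch: the helper names class_to_entry and find_jars_with_class are made up for illustration, the directory in the usage comment is a hypothetical placeholder, and it assumes unzip is installed on the machine where you run it.

```shell
# Convert a fully qualified class name into the entry path used inside a JAR,
# e.g. a.b.C -> a/b/C.class
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Print every JAR in directory $2 that contains class $1.
# Assumes `unzip` is available; JARs that cannot be read are skipped silently.
find_jars_with_class() {
  fj_entry=$(class_to_entry "$1")
  for fj_jar in "$2"/*.jar; do
    [ -f "$fj_jar" ] || continue
    if unzip -l "$fj_jar" 2>/dev/null | grep -qF "$fj_entry"; then
      printf '%s\n' "$fj_jar"
    fi
  done
}

# Usage (hypothetical directory, replace with wherever the vendor JARs live):
#   find_jars_with_class com.pepperdata.spark.metrics.PepperdataSparkListener /path/to/vendor/lib
# Append the path it prints to the comma-separated --jars list of spark-submit.
```

If the search prints nothing, the class is not in that directory, and adding those JARs to --jars will not change the error.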
Also, I use the sbt-assembly plugin, which takes care of this for you:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")
Then build with sbt compile assembly, and all the JARs your application needs will be bundled into the job JAR that is sent to YARN.
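As a rough sketch of the build-and-submit step: sbt-assembly writes its output to target/scala-<version>/<name>-assembly-<version>.jar by default, so a small helper can pick up the newest one. The function name newest_assembly_jar is invented for this example, and the approach only helps if the library that provides the missing class is declared as a dependency in the build, which is an assumption here.

```shell
# Print the most recently modified assembly JAR under <project>/target/scala-*/.
# $1 is the project root (defaults to the current directory).
# Prints nothing if no assembly JAR has been built yet.
newest_assembly_jar() {
  ls -t "${1:-.}"/target/scala-*/*-assembly-*.jar 2>/dev/null | head -n 1
}

# Usage, run from the project root after building:
#   sbt assembly
#   spark-submit --master yarn-cluster \
#     --class com.reni.scmplatform.data.producer.DPMain \
#     "$(newest_assembly_jar .)"
```

Because the fat JAR carries its dependencies with it, the long --jars list is only needed for libraries that are deliberately left out of the assembly.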