I am trying to run my own Spark application, but when I use the spark-submit command I get this error:
Users/_name_here/dev/sp/target/scala-2.10/sp_2.10-0.1-SNAPSHOT.jar --stacktrace
java.lang.ClassNotFoundException: /Users/_name_here/dev/sp/mo/src/main/scala/MySimpleApp
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:340)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:633)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I am using the following command:
/Users/_name_here/dev/spark/bin/spark-submit \
  --class "/Users/_name_here/dev/sp/mo/src/main/scala/MySimpleApp" \
  --master local[4] \
  /Users/_name_here/dev/sp/target/scala-2.10/sp_2.10-0.1-SNAPSHOT.jar
My build.sbt looks like this:
name := "mo"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies ++= Seq(
"org.apache.spark" % "spark-core_2.10" % "1.4.0",
"org.postgresql" % "postgresql" % "9.4-1201-jdbc41",
"org.apache.spark" % "spark-sql_2.10" % "1.4.0",
"org.apache.spark" % "spark-mllib_2.10" % "1.4.0",
"org.tachyonproject" % "tachyon-client" % "0.6.4",
"org.postgresql" % "postgresql" % "9.4-1201-jdbc41",
"org.apache.spark" % "spark-hive_2.10" % "1.4.0",
"com.typesafe" % "config" % "1.2.1"
)
resolvers += "Typesafe Repo" at "http://repo.typesafe.com/typesafe/releases/"
My plugins.sbt:
logLevel := Level.Warn
resolvers += "Sonatype snapshots" at "https://oss.sonatype.org/content/repositories/snapshots/"
addSbtPlugin("com.github.mpeltonen" % "sbt-idea" % "1.6.0")
addSbtPlugin("com.eed3si9n" % "sbt-assembly" %"0.11.2")
I am using the prebuilt package from spark.apache.org. I installed sbt and Scala via brew. Running sbt package from the project root works fine and creates the jar, but sbt assembly does not work at all, possibly because something is missing in the rebuilt Spark folder. I would appreciate any help, because I am quite new to this. Oh, and by the way, the app runs fine inside IntelliJ.
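As an aside, a common reason sbt assembly fails for Spark projects is merge conflicts caused by bundling the Spark jars themselves into the fat jar. A sketch of the usual remedy, offered here as an assumption rather than something from the original post: mark the Spark artifacts as provided, since spark-submit supplies them at runtime.

libraryDependencies ++= Seq(
  // "provided" keeps Spark out of the assembled fat jar; spark-submit
  // puts these classes on the classpath at runtime anyway.
  "org.apache.spark" % "spark-core_2.10" % "1.4.0" % "provided",
  "org.apache.spark" % "spark-sql_2.10" % "1.4.0" % "provided",
  "org.apache.spark" % "spark-mllib_2.10" % "1.4.0" % "provided",
  "org.apache.spark" % "spark-hive_2.10" % "1.4.0" % "provided"
)

Note that with provided, sbt run no longer sees the Spark classes, so this is typically used only for the assembly build.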
Answer 0 (score: 8)
You should not reference your class by its directory path, but by its package path. For example:
/Users/_name_here/dev/spark/bin/spark-submit \
  --master local[4] \
  --class com.example.MySimpleApp \
  /Users/_name_here/dev/sp/target/scala-2.10/sp_2.10-0.1-SNAPSHOT.jar
From what I can see, you do not have MySimpleApp in any package, so just --class "MySimpleApp" should do.
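For illustration, here is a minimal sketch of MySimpleApp with a package declaration, showing how the package maps to the --class argument (the package name com.example and the body are assumptions, not from the original post):

package com.example

import org.apache.spark.{SparkConf, SparkContext}

// Because of the package declaration above, the fully qualified name
// to pass to spark-submit is: --class com.example.MySimpleApp
object MySimpleApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("MySimpleApp")
    val sc = new SparkContext(conf)
    println("count = " + sc.parallelize(1 to 100).count())
    sc.stop()
  }
}

A quick way to check the exact name to pass is to list the jar contents, e.g. jar tf target/scala-2.10/sp_2.10-0.1-SNAPSHOT.jar | grep MySimpleApp.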
Answer 1 (score: 0)
Apparently there was a problem with my project structure in general. I created a new project with sbt and Sublime, and now spark-submit works. It is really strange, though, because I never changed anything about the default structure of the sbt project that IntelliJ generated. This is the project structure that now works like a charm:
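The snippet itself was lost in the original post; presumably it showed the default sbt layout, roughly:

sp/
├── build.sbt
├── project/
│   └── plugins.sbt
└── src/
    └── main/
        └── scala/
            └── MySimpleApp.scala

With MySimpleApp.scala sitting directly under src/main/scala and no package declaration, --class MySimpleApp is the name to pass.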
Thanks for your help!
Answer 2 (score: 0)
The problem is that you are passing an incorrect --class argument.
If you are using a Java Maven project, make sure you pass the correct class path: instead of --class "/Users/_name_here/dev/sp/mo/src/main/scala/MySimpleApp", it should look like com.example.myclass. If the class has no package, it could also be just --class myclass.
Here are a number of spark-submit examples:
# Run application locally on 8 cores
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master local[8] \
  /path/to/examples.jar \
  100

# Run on a Spark standalone cluster in client deploy mode
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://207.184.161.138:7077 \
  --executor-memory 20G \
  --total-executor-cores 100 \
  /path/to/examples.jar \
  1000

# Run on a Spark standalone cluster in cluster deploy mode with supervise
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master spark://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  /path/to/examples.jar \
  1000

# Run on a YARN cluster
export HADOOP_CONF_DIR=XXX
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode cluster \  # can be client for client mode
  --executor-memory 20G \
  --num-executors 50 \
  /path/to/examples.jar \
  1000

# Run a Python application on a Spark standalone cluster
./bin/spark-submit \
  --master spark://207.184.161.138:7077 \
  examples/src/main/python/pi.py \
  1000

# Run on a Mesos cluster in cluster deploy mode with supervise
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master mesos://207.184.161.138:7077 \
  --deploy-mode cluster \
  --supervise \
  --executor-memory 20G \
  --total-executor-cores 100 \
  http://path/to/examples.jar \
  1000