I am trying to run a Spark job written in Scala on a YARN cluster, and I get this error:
[!@#$% spark-1.0.0-bin-hadoop2]$ export HADOOP_CONF_DIR="/etc/hadoop/conf"
[!@#$% spark-1.0.0-bin-hadoop2]$ ./bin/spark-submit --class "SimpleAPP" \
> --master yarn-client \
> test_proj/target/scala-2.10/simple-project_2.10-0.1.jar
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Exception in thread "main" java.lang.ClassNotFoundException: SimpleAPP
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:289)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Here is my sbt file:
[!@#$% test_proj]$ cat simple.sbt
name := "Simple Project"
version := "0.1"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0"
// We need to be able to write Avro in Parquet
// libraryDependencies += "com.twitter" % "parquet-avro" % "1.3.2"
resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
Here is my SimpleApp.scala program, which is the canonical example:
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "/home/myname/spark-1.0.0-bin-hadoop2/README.md" // Should be some file on your system
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
The output of sbt package is as follows:
[!@#$% test_proj]$ sbt package
[info] Set current project to Simple Project (in build file:/home/myname/spark-1.0.0-bin-hadoop2/test_proj/)
[info] Compiling 1 Scala source to /home/myname/spark-1.0.0-bin-hadoop2/test_proj/target/scala-2.10/classes...
[info] Packaging /home/myname/spark-1.0.0-bin-hadoop2/test_proj/target/scala-2.10/simple-project_2.10-0.1.jar ...
[info] Done packaging.
[success] Total time: 12 s, completed Mar 3, 2015 10:57:12 PM
As suggested, I ran the following:
jar tf simple-project_2.10-0.1.jar | grep .class
which prints:
SimpleApp$$anonfun$1.class
SimpleApp$.class
SimpleApp$$anonfun$2.class
SimpleApp.class
Answer 0 (score: 1)
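Verify that the class name inside the jar really is SimpleAPP. Class names are case-sensitive, so the value passed to --class must match the compiled class exactly.
Run this: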
jar tf simple-project_2.10-0.1.jar | grep .class
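and check that the class name is correct. In the listing shown in the question, the jar contains SimpleApp.class rather than SimpleAPP.class, so the --class argument should be "SimpleApp".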
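The check above can also be scripted so that it is an exact, case-sensitive match rather than an eyeball comparison. This is only a minimal sketch: `has_class` is a hypothetical helper name, not part of Spark or the JDK, and it assumes the jar listing (as printed by `jar tf`) is piped in on standard input.

```shell
# has_class: hypothetical helper, not part of Spark or the JDK.
# Reads a jar listing (one entry per line, as printed by `jar tf`) on stdin
# and succeeds only if it contains the exact entry for the given class name.
has_class() {
  # $1 is the class name as it would be passed to --class,
  # e.g. SimpleApp or com.example.SimpleApp.
  # Dots in a package name become directory separators inside the jar.
  entry="$(printf '%s' "$1" | tr . /).class"
  # -F: fixed string, -x: match the whole line, -q: quiet (exit status only).
  # The match is case-sensitive, just like the JVM class lookup.
  grep -Fxq -- "$entry"
}

# Usage (assumes the jar path from the question):
#   jar tf test_proj/target/scala-2.10/simple-project_2.10-0.1.jar | has_class SimpleApp && echo "found"
```

With the listing from the question, this would succeed for SimpleApp and fail for SimpleAPP, which suggests the original submit command should work once `--class "SimpleAPP"` is changed to `--class "SimpleApp"`.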