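Others have already posted questions about building the canonical Simple Application from the Scala-Spark quick start guide. In my case, either the jar I build does not include the class I am trying to create, or the SparkContext object I am trying to import is not recognized. Everything I have tried comes from suggestions on similar Spark Scala questions on Stack Overflow.
In the directory where I installed Spark, I set up the following directory structure, following the instructions on the Apache Spark website and the general sbt guides: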
simple.sbt
\src
\src\main\scala\
\src\main\scala\SimpleApp.scala
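For reference, the layout above could be recreated from a shell roughly as follows. This is only a sketch: the project root is a throwaway directory from `mktemp` (an assumption for illustration), and the two files are empty placeholders for the ones shown in this post. The point is that `simple.sbt` and `src/` sit side by side in the same root.

```shell
#!/bin/sh
# Sketch: recreate the layout listed above in a scratch directory.
# PROJECT_DIR is hypothetical; any empty directory would do.
PROJECT_DIR="$(mktemp -d)"

# src/main/scala under the project root is sbt's default Scala source directory.
mkdir -p "$PROJECT_DIR/src/main/scala"

# Empty placeholders; the real contents are the files shown in this post.
touch "$PROJECT_DIR/simple.sbt"
touch "$PROJECT_DIR/src/main/scala/SimpleApp.scala"

# List the files relative to the project root.
(cd "$PROJECT_DIR" && find . -type f | sort)
```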
Here is the code for the SimpleApp.scala file:
/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "/Users/sidneygivens/spark-1.4.1/README.md"
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
Here are the contents of the simple.sbt file:
name := "Simple Project"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1"
resolvers += "bintray-spark-packages" at "https://dl.bintray.com/spark-packages/maven/"
resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
When I run sbt package from the directory where Spark is installed, I get an error saying there is a problem with the import statement for SparkContext:
[error] /Users/sidneygivens/src/main/scala/SimpleApp.scala:4: object apache is not a member of package org
[error] import org.apache.spark.SparkContext
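If I go to the src/main directory, I can create a jar with the sbt package command. However, the jar does not contain the SimpleApp class.
In earlier questions on this issue, it was suggested to run this command to list the class names: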
jar tf main_2.10-0.1-SNAPSHOT.jar
This is all it returns:
META-INF/MANIFEST.MF
None of the classes I would expect from the build are present in the .jar.
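One thing that may be worth checking here is which directory sbt treats as the project root. sbt uses the directory it is launched from as the root, reads the `*.sbt` files there, and looks for Scala sources under `src/main/scala` relative to it. A jar named `main_2.10-0.1-SNAPSHOT.jar` looks consistent with sbt falling back to its defaults (the directory name `main` and version `0.1-SNAPSHOT`) rather than using the settings in simple.sbt. The sketch below checks a directory for both pieces; `looks_like_sbt_root` is a hypothetical helper written for this illustration, not an sbt command, and the demo paths are made up.

```shell
#!/bin/sh
# Sketch: report whether a directory looks like an sbt project root, i.e. it
# holds a *.sbt build file and Scala sources somewhere under src/main/scala.
# looks_like_sbt_root is a hypothetical helper, not part of sbt.
looks_like_sbt_root() {
  dir="$1"
  # At least one build definition file directly in the directory.
  if ! ls "$dir"/*.sbt >/dev/null 2>&1; then
    echo "no .sbt file in $dir"
    return 1
  fi
  # At least one Scala source under the default source directory.
  if ! find "$dir/src/main/scala" -name '*.scala' 2>/dev/null | grep -q .; then
    echo "no Scala sources under $dir/src/main/scala"
    return 1
  fi
  echo "ok: $dir"
}

# Demo on a scratch layout (paths are made up for illustration).
ROOT="$(mktemp -d)"
mkdir -p "$ROOT/src/main/scala"
touch "$ROOT/simple.sbt" "$ROOT/src/main/scala/SimpleApp.scala"

looks_like_sbt_root "$ROOT"                   # directory holding simple.sbt
looks_like_sbt_root "$ROOT/src/main" || true  # directory where the empty jar was built
```

Under these assumptions the first call passes and the second fails, which would match a build run from src/main finding neither the build file nor any sources.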
As expected, running a variant of the spark-submit command suggested in the Apache Spark quick start guide against the .jar file I managed to create does not work:
/Users/sidneygivens/spark-1.4.1/bin/spark-submit \
--class "SimpleApp" \
--master local[1] \
/Users/sidneygivens/spark-1.4.1/src/main/target/scala-2.10/main_2.10-0.1-SNAPSHOT.jar
It gives me a ClassNotFoundException:
java.lang.ClassNotFoundException: SimpleApp
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
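Everything I have tried seems to match the Spark/sbt boilerplate, but I am clearly doing something wrong. Where should sbt package be run from? Is something wrong with my Spark/sbt setup?
Here is the full output of the sbt package call: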
[info] Set current project to sparkscala (in build file:/Users/sidneygivens/spark-1.4.1/sparkscala/)
[info] Compiling 1 Scala source to /Users/sidneygivens/spark-1.4.1/sparkscala/target/scala-2.10/classes...
[error] /Users/sidneygivens/spark-1.4.1/sparkscala/src/main/scala/SimpleApp.scala:4: object apache is not a member of package org
[error] import org.apache.spark._
[error] ^
[error] /Users/sidneygivens/spark-1.4.1/sparkscala/src/main/scala/SimpleApp.scala:10: not found: type SparkConf
[error] val conf = new SparkConf().setAppName("Simple App")
[error] ^
[error] /Users/sidneygivens/spark-1.4.1/sparkscala/src/main/scala/SimpleApp.scala:11: not found: type SparkContext
[error] val sc = new SparkContext(conf)
[error] ^
[error] three errors found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 2 s, completed Aug 23, 2015 11:30:22 PM
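One detail in this output may be worth a closer look: sbt reports the current project as `sparkscala`, which matches the directory name rather than the `name` set in simple.sbt ("Simple Project"). That might mean the build file, and with it the spark-core dependency, is not being picked up from this directory, though that is only a reading of the log. The sketch below makes the comparison explicit; `project_name_from_log` is a hypothetical helper for illustration, and the log line is copied from the output above.

```shell
#!/bin/sh
# Sketch: pull the project name out of sbt's "Set current project to" line
# and compare it with the name declared in the build file.
# project_name_from_log is a hypothetical helper, not part of sbt.
project_name_from_log() {
  sed -n 's/^\[info\] Set current project to \(.*\) (in build file:.*$/\1/p'
}

# First line of the output above, copied verbatim.
LOG_LINE='[info] Set current project to sparkscala (in build file:/Users/sidneygivens/spark-1.4.1/sparkscala/)'
ACTUAL="$(printf '%s\n' "$LOG_LINE" | project_name_from_log)"
EXPECTED='Simple Project'   # the name set in simple.sbt

if [ "$ACTUAL" = "$EXPECTED" ]; then
  echo "build file settings were applied"
else
  echo "sbt reports '$ACTUAL', not '$EXPECTED'"
fi
```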