我想将 ShapeLogic Scala 与 Spark 结合使用。我正在使用 Scala 2.11.8 , Spark 2.1.1 和 ShapeLogic Scala 0.9.0 。 我成功地导入了类以使用Spark管理图像。此外,我成功编译并打包(使用 SBT )以下应用程序,以便将其提交给集群。
以下应用程序只需打开图像并将其写入文件夹:
// imageTest.scala
import org.apache.spark.sql.SparkSession
import org.shapelogic.sc.io.LoadImage
import org.shapelogic.sc.image.BufferImage
import org.shapelogic.sc.io.BufferedImageConverter
object imageTestObj {
def main(args: Array[String]) {
// Create a Scala Spark Session
val spark = SparkSession.builder().appName("imageTest").master("local").getOrCreate();
val inPathStr = "/home/vitrion/IdeaProjects/imageTest";
val outPathStr = "/home/vitrion/IdeaProjects/imageTest/output";
val empty = new BufferImage[Byte](0, 0, 0, Array());
var a = Array.fill(3)(empty);
for (i <- 0 to 3) {
val imagePath = inPathStr + "IMG_" + "%01d".format(i + 1);
a(i) = LoadImage.loadBufferImage(inPathStr);
}
val sc = spark.sparkContext;
val imgRDD = sc.parallelize(a);
imgRDD.map { outBufferImage =>
val imageOpt = BufferedImageConverter.bufferImage2AwtBufferedImage(outBufferImage)
imageOpt match {
case Some(bufferedImage) => {
LoadImage.saveAWTBufferedImage(bufferedImage, "png", outPathStr)
println("Saved " + outPathStr)
}
case None => {
println("Could not convert image")
}
}
}
}
}
这是我的SBT文件
name := "imageTest"
version := "0.1"
scalaVersion := "2.11.8"
libraryDependencies ++= Seq(
"org.apache.spark" % "spark-core_2.11" % "2.1.1" % "provided",
"org.apache.spark" % "spark-sql_2.11" % "2.1.1" % "provided",
"org.shapelogicscala" %% "shapelogic" % "0.9.0" % "provided"
)
但是,出现以下错误。执行包 SBT命令时,似乎ShapeLogic Scala依赖项未包含在应用程序JAR中:
[vitrion@mstr scala-2.11]$ pwd
/home/vitrion/IdeaProjects/imageTest/target/scala-2.11
[vitrion@mstr scala-2.11]$ ls
classes imagetest_2.11-0.1.jar resolution-cache
[vitrion@mstr scala-2.11]$ spark-submit --class imageTestObj imagetest_2.11-0.1.jar
Exception in thread "main" java.lang.NoClassDefFoundError: org/shapelogic/sc/image/BufferImage
at imageTestObj.main(imageTest.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.shapelogic.sc.image.BufferImage
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 10 more
我希望有人可以帮我解决一下吗? 非常感谢你
答案 0 :(得分:1)
这个错误说明了一切:
Caused by: java.lang.ClassNotFoundException: org.shapelogic.sc.image.BufferImage
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
添加此缺少的依赖项(ShapeLogic类)------&gt; org.shapelogic.sc.image.BufferImage,应该解决这个问题。如果你错过了这种依赖,Maven或SBT都应该给出同样的错误!!
由于您正在使用群集模式,因此可以使用--jars
直接在spark-submit上添加依赖项,请按照此post了解详细信息。
答案 1 :(得分:0)
默认情况下,在提交给spark的jar中,sbt文件中列出的依赖项不会包含在内,因此对于sbt,你必须使用一个插件来构建一个包含 shapelogicscala 类的超级/胖子jar 。您可以在SO How to build an Uber JAR (Fat JAR) using SBT within IntelliJ IDEA?上使用此链接,了解如何使用sbt管理此内容。