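I am trying to run the Twitter classifier from https://github.com/databricks/reference-apps, which uses Spark to analyze the feeds. I have loaded the project in IntelliJ, and one of the dependencies shown under External Libraries, org.apache.spark.mllib, is not working.
I get a java.lang.NoClassDefFoundError when I run it, even though the dependency is present.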
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/mllib/feature/HashingTF
at com.databricks.apps.twitter_classifier.Utils$.<init>(Utils.scala:12)
at com.databricks.apps.twitter_classifier.Utils$.<clinit>(Utils.scala)
at com.databricks.apps.twitter_classifier.Collect$.main(Collect.scala:26)
at com.databricks.apps.twitter_classifier.Collect.main(Collect.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.mllib.feature.HashingTF
at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 9 more
My build.sbt looks like this:
import AssemblyKeys._
name := "spark-twitter-lang-classifier"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.1.0" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "1.1.0" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.1.0" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.1.0" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-streaming-twitter" % "1.1.0"
libraryDependencies += "com.google.code.gson" % "gson" % "2.3"
libraryDependencies += "org.twitter4j" % "twitter4j-core" % "3.0.3"
libraryDependencies += "commons-cli" % "commons-cli" % "1.2"
resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
assemblySettings
mergeStrategy in assembly := {
  case m if m.toLowerCase.endsWith("manifest.mf") => MergeStrategy.discard
  case m if m.toLowerCase.matches("meta-inf.*\\.sf$") => MergeStrategy.discard
  case "log4j.properties" => MergeStrategy.discard
  case m if m.toLowerCase.startsWith("meta-inf/services/") => MergeStrategy.filterDistinctLines
  case "reference.conf" => MergeStrategy.concat
  case _ => MergeStrategy.first
}
Answer 0 (score: 0)
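I think you have to change the scope of the mllib dependency to runtime. Your build.sbt marks it as "provided", so it is available at compile time but is not put on the classpath when the application is launched from IntelliJ, which would explain the ClassNotFoundException for org.apache.spark.mllib.feature.HashingTF. In Maven, you can add a scope tag set to runtime to give a dependency runtime scope.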
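In sbt rather than Maven, a minimal sketch of the same idea could look like the fragment below. It rests on two assumptions: the job is started directly from the IDE, not through spark-submit, and the versions are the ones from the question. Note that it drops the "provided" qualifier and uses the default compile scope instead of `% "runtime"`, because in sbt a runtime-only dependency is not on the compile classpath, and the code references HashingTF at compile time.

```scala
// build.sbt fragment (sketch, not the project's official build):
// Spark modules on the default "compile" scope, so they are on the
// classpath both when compiling and when running from the IDE.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"      % "1.1.0",
  "org.apache.spark" %% "spark-mllib"     % "1.1.0",
  "org.apache.spark" %% "spark-sql"       % "1.1.0",
  "org.apache.spark" %% "spark-streaming" % "1.1.0"
)
```

If the project is later packaged with sbt-assembly and submitted to a cluster that already ships Spark, the "provided" qualifier would probably need to be restored so that the Spark jars are not bundled into the assembly.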