NoClassDefFoundError with mllib 1.1.0

Asked: 2014-12-22 08:46:07

Tags: intellij-idea apache-spark apache-spark-mllib

I am trying to run the Twitter classifier from https://github.com/databricks/reference-apps, which uses Spark to analyze the feeds. I have loaded the project in IntelliJ, and one of the dependencies shown under External Libraries, org.apache.spark.mllib, is not working properly.

[Screenshot: dependency shown in IntelliJ IDEA]

When I run it, I get a java.lang.NoClassDefFoundError, even though the dependency is present:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/mllib/feature/HashingTF
  at com.databricks.apps.twitter_classifier.Utils$.<init>(Utils.scala:12)
  at com.databricks.apps.twitter_classifier.Utils$.<clinit>(Utils.scala)
  at com.databricks.apps.twitter_classifier.Collect$.main(Collect.scala:26)
  at com.databricks.apps.twitter_classifier.Collect.main(Collect.scala)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:483)
  at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.mllib.feature.HashingTF
  at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
  at java.security.AccessController.doPrivileged(Native Method)
  at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  ... 9 more

My build.sbt looks like this:

import AssemblyKeys._

name := "spark-twitter-lang-classifier"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.1.0" % "provided"

libraryDependencies += "org.apache.spark" %% "spark-mllib" % "1.1.0" % "provided"

libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.1.0" % "provided"

libraryDependencies += "org.apache.spark" %% "spark-streaming" % "1.1.0" % "provided"

libraryDependencies += "org.apache.spark" %% "spark-streaming-twitter" % "1.1.0"

libraryDependencies += "com.google.code.gson" % "gson" % "2.3"

libraryDependencies += "org.twitter4j" % "twitter4j-core" % "3.0.3"

libraryDependencies += "commons-cli" % "commons-cli" % "1.2"

resolvers += "Akka Repository" at "http://repo.akka.io/releases/"

assemblySettings

mergeStrategy in assembly := {
  case m if m.toLowerCase.endsWith("manifest.mf")          => MergeStrategy.discard
  case m if m.toLowerCase.matches("meta-inf.*\\.sf$")      => MergeStrategy.discard
  case "log4j.properties"                                  => MergeStrategy.discard
  case m if m.toLowerCase.startsWith("meta-inf/services/") => MergeStrategy.filterDistinctLines
  case "reference.conf"                                    => MergeStrategy.concat
  case _                                                   => MergeStrategy.first
}

1 Answer:

Answer 0 (score: 0)

I think you have to change the scope of the mllib dependency to runtime. It is currently marked "provided", which keeps it off the runtime classpath when you launch the app from IntelliJ, so the class cannot be found. In Maven you would express this by setting the dependency's scope with a runtime tag.
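In sbt terms, that means dropping the `"provided"` qualifier on the spark-mllib line (or scoping it to runtime). A minimal sketch, assuming the app is launched directly from IntelliJ rather than submitted to a cluster:

```scala
// build.sbt (sketch): without "provided", spark-mllib stays on the runtime
// classpath, so org.apache.spark.mllib.feature.HashingTF can be resolved
// when running from the IDE.
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "1.1.0"

// Alternatively, scope it to the runtime configuration explicitly:
// libraryDependencies += "org.apache.spark" %% "spark-mllib" % "1.1.0" % "runtime"
```

Note that if you build an assembly jar and submit it to a Spark cluster with spark-submit, `"provided"` is the intended scope (the cluster supplies the Spark jars); the mismatch only bites when launching from the IDE with that same build definition.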