How to resolve java.io.InvalidClassException: local class incompatible, with ALS in Scala Spark

Asked: 2018-05-04 10:58:57

Tags: java scala apache-spark cassandra spark-submit

I am trying to train an ALS algorithm via spark-submit, save the model on HDFS, and make predictions against Cassandra using the Cassandra connector.

The training runs without any problem, but then I get this error:

java.io.InvalidClassException: scala.math.Ordering$$anonfun$by$1; local class incompatible: stream classdesc serialVersionUID = 3410834592477398573, local class serialVersionUID = 0

or this one:

 java.io.InvalidClassException: com.datastax.spark.connector.rdd.ReadConf; local class incompatible: stream classdesc serialVersionUID = 3166370729900710116, local class serialVersionUID = -7280317430045994784

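For context, these errors come from Java serialization: when a class has no explicit `@SerialVersionUID`, the JVM derives the UID from the compiled class's structure, so two builds of "the same" class (for example, from different Scala or connector versions on the driver versus the executors) can disagree, and deserialization fails exactly as in the stack traces above. A minimal sketch, using a hypothetical `Rating` class, of how the local UID can be inspected:

```scala
import java.io.ObjectStreamClass

// Hypothetical class standing in for any serialized Spark/connector type.
// With an explicit UID, recompilation keeps the value stable; without one,
// the JVM computes it from the class shape, which is what makes mixed
// classpaths fail with InvalidClassException.
@SerialVersionUID(1L)
class Rating(val user: Int, val item: Int, val value: Double) extends Serializable

object CheckUid {
  def main(args: Array[String]): Unit = {
    // Print the UID the local classpath assigns; the UID embedded in the
    // incoming byte stream must match this value, or deserialization fails.
    val uid = ObjectStreamClass.lookup(classOf[Rating]).getSerialVersionUID
    println(s"local serialVersionUID = $uid")
  }
}
```

Running the same check on the driver and on an executor classpath would reveal whether both sides load identical compiled classes.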
It seems I have some incompatible libraries. I tried to overcome the problem with various changes to my build.sbt (below):
val sparkVersion = "2.2.1"
assemblyMergeStrategy in assembly := {
  case PathList("org","aopalliance", xs @ _*) => MergeStrategy.last
  case PathList("javax", "inject", xs @ _*) => MergeStrategy.last
  case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
  case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
  case PathList("org", "apache", xs @ _*) => MergeStrategy.first
  case PathList("com", "google", xs @ _*) => MergeStrategy.last
  case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last
  case PathList("com", "codahale", xs @ _*) => MergeStrategy.last
  case PathList("com", "yammer", xs @ _*) => MergeStrategy.last
  case "about.html" => MergeStrategy.rename
  case "META-INF/MANIFEST.MF" => MergeStrategy.discard
  case "META-INF/mailcap" => MergeStrategy.last
  case PathList("META-INF", ps @ _*) => MergeStrategy.discard
  case "plugin.properties" => MergeStrategy.last
  case "log4j.properties" => MergeStrategy.last
  case "overview.html" => MergeStrategy.last  // Added this for 2.1.0 I think
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}


lazy val root = (project in file("."))
  .settings(
    name := "Perso_ALS",
    version := "1.0",
    scalaVersion := "2.11.4",
    resolvers += "Apache Maven Central Repository" at "http://repo.maven.apache.org/maven2/",
    libraryDependencies += "com.github.fommil.netlib" % "all" % "1.1.2" pomOnly(),
    libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector" % "2.0.6",
    libraryDependencies += "org.apache.spark" %% "spark-core" % sparkVersion exclude("org.xerial.snappy", "snappy-java"),
    libraryDependencies += "org.apache.spark" %% "spark-catalyst" % sparkVersion,
    libraryDependencies += "org.apache.spark" %% "spark-mllib-local" % sparkVersion,    
    libraryDependencies += "org.apache.spark" %% "spark-mllib" % sparkVersion, 
    libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVersion,
    libraryDependencies += "org.apache.logging.log4j" % "log4j-api" % "2.10.0",
    libraryDependencies += "org.apache.logging.log4j" % "log4j-core" % "2.10.0",  
    libraryDependencies += "com.typesafe" % "config" % "1.3.3",
    libraryDependencies += "org.xerial.snappy" % "snappy-java" % "1.1.7.1",
    libraryDependencies += "io.netty" % "netty-all" % "4.0.43.Final",
    libraryDependencies += "commons-net" % "commons-net" % "2.2",
    libraryDependencies += "com.google.guava" % "guava" % "11.0.2",
    libraryDependencies += "org.scala-lang" % "scala-library" %  "2.11.4"
  )
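This class of error usually means the fat jar bundles Spark (or connector) classes compiled against a different version than the one installed on the cluster, so the stream written by one side and the classes loaded by the other disagree. A common mitigation, shown here as a sketch against the build above (not a confirmed fix for this exact setup), is to mark the Spark artifacts as `provided` so only the cluster's own Spark jars end up on the runtime classpath:

```scala
// Sketch: mark Spark as "provided" so sbt-assembly does not bundle Spark
// classes into the fat jar; at runtime, spark-submit supplies the cluster's
// own (matching) Spark jars instead.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % sparkVersion % "provided",
  "org.apache.spark" %% "spark-mllib" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql"   % sparkVersion % "provided"
)
```

With `provided` scope it also becomes important that `sparkVersion` in build.sbt matches the Spark version on the cluster exactly, and that the Scala minor version (2.11 here) matches the one the cluster's Spark was built with.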

I also tried adding `dependencyOverrides` lines, but without success.
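For reference, the sbt key is `dependencyOverrides`; a sketch of the kind of pin that could be attempted (the coordinates below are illustrative, taken from the dependency list above, not a known fix):

```scala
// Force a single version of a transitive dependency across the whole build,
// so different libraries cannot pull in conflicting copies.
dependencyOverrides += "io.netty" % "netty-all" % "4.0.43.Final"
dependencyOverrides += "com.google.guava" % "guava" % "11.0.2"
```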

Has anyone run into this kind of problem before, and how did you solve it? Is there anything I should change, given that I am using Spark 2.2.1 and Scala 2.11 (compatible with the Cassandra connector)?

Thanks

0 Answers:

No answers yet