I am trying to train an ALS model via spark-submit, save the model on HDFS, and run predictions against Cassandra through the Spark Cassandra connector.
Training works without any problem, but then I get this error:
java.io.InvalidClassException: scala.math.Ordering$$anonfun$by$1; local class incompatible: stream classdesc serialVersionUID = 3410834592477398573, local class serialVersionUID = 0
or this one:
java.io.InvalidClassException: com.datastax.spark.connector.rdd.ReadConf; local class incompatible: stream classdesc serialVersionUID = 3166370729900710116, local class serialVersionUID = -7280317430045994784
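From what I understand, an InvalidClassException with mismatched serialVersionUIDs means the deserializing JVM loaded a different build of the class than the one that wrote the stream, i.e. my driver and executors see different jars. As a minimal check (assuming spark-shell is available on the cluster), something like this should print which jar each conflicting class comes from, so the driver and executor classpaths can be compared:

// Prints the jar a class was loaded from; run it on the driver and
// again inside an executor task, then compare the two locations.
def sourceOf(name: String): String =
  Option(Class.forName(name).getProtectionDomain.getCodeSource)
    .map(_.getLocation.toString)
    .getOrElse("<bootstrap classpath>")

println(sourceOf("scala.math.Ordering"))
println(sourceOf("com.datastax.spark.connector.rdd.ReadConf"))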
It looks like I have some incompatible libraries. I have tried to work around this with various changes to my build.sbt (shown below):

val sparkVersion = "2.2.1"
assemblyMergeStrategy in assembly := {
  case PathList("org", "aopalliance", xs @ _*) => MergeStrategy.last
  case PathList("javax", "inject", xs @ _*) => MergeStrategy.last
  case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
  case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
  case PathList("org", "apache", xs @ _*) => MergeStrategy.first
  case PathList("com", "google", xs @ _*) => MergeStrategy.last
  case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last
  case PathList("com", "codahale", xs @ _*) => MergeStrategy.last
  case PathList("com", "yammer", xs @ _*) => MergeStrategy.last
  case "about.html" => MergeStrategy.rename
  case "META-INF/MANIFEST.MF" => MergeStrategy.discard
  case "META-INF/mailcap" => MergeStrategy.last
  case PathList("META-INF", ps @ _*) => MergeStrategy.discard
  case "plugin.properties" => MergeStrategy.last
  case "log4j.properties" => MergeStrategy.last
  case "overview.html" => MergeStrategy.last // added this for 2.1.0, I think
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
lazy val root = (project in file("."))
  .settings(
    name := "Perso_ALS",
    version := "1.0",
    scalaVersion := "2.11.4",
    resolvers += "Apache Maven Central Repository" at "http://repo.maven.apache.org/maven2/",
    libraryDependencies += "com.github.fommil.netlib" % "all" % "1.1.2" pomOnly(),
    libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector" % "2.0.6",
    libraryDependencies += "org.apache.spark" %% "spark-core" % sparkVersion exclude("org.xerial.snappy", "snappy-java"),
    libraryDependencies += "org.apache.spark" %% "spark-catalyst" % sparkVersion,
    libraryDependencies += "org.apache.spark" %% "spark-mllib-local" % sparkVersion,
    libraryDependencies += "org.apache.spark" %% "spark-mllib" % sparkVersion,
    libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVersion,
    libraryDependencies += "org.apache.logging.log4j" % "log4j-api" % "2.10.0",
    libraryDependencies += "org.apache.logging.log4j" % "log4j-core" % "2.10.0",
    libraryDependencies += "com.typesafe" % "config" % "1.3.3",
    libraryDependencies += "org.xerial.snappy" % "snappy-java" % "1.1.7.1",
    libraryDependencies += "io.netty" % "netty-all" % "4.0.43.Final",
    libraryDependencies += "commons-net" % "commons-net" % "2.2",
    libraryDependencies += "com.google.guava" % "guava" % "11.0.2",
    libraryDependencies += "org.scala-lang" % "scala-library" % "2.11.4"
  )
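One thing I have not ruled out is the assembly itself bundling duplicate copies of classes the cluster already ships. A sketch of what I understand to be the usual setup for spark-submit (assuming the cluster runs Spark 2.2.1 built for Scala 2.11, so those jars are already on the executors' classpath) would mark the Spark artifacts as "provided" so they stay out of the fat jar:

// Sketch: keep Spark (and its transitive scala-library) out of the assembly;
// only the Cassandra connector and other extras get bundled.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql"   % sparkVersion % "provided",
  "org.apache.spark" %% "spark-mllib" % sparkVersion % "provided",
  "com.datastax.spark" %% "spark-cassandra-connector" % "2.0.6"
)

That way spark-submit supplies Spark at runtime and the serialized classes on both sides should come from the same jars.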
I also tried adding dependencyOverrides lines, but without success.
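(By that I mean something like the following, assuming sbt 1.x where dependencyOverrides is a Seq; under sbt 0.13 it is a Set:)

// Sketch: force sbt to resolve these exact versions across all dependencies.
dependencyOverrides ++= Seq(
  "com.google.guava" % "guava" % "11.0.2",
  "io.netty" % "netty-all" % "4.0.43.Final"
)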
Has anyone run into this kind of problem before, and how did you solve it? Is there anything I should change, given that I am on Spark 2.2.1 and Scala 2.11 (for compatibility with the Cassandra connector)?
Thanks