I am trying to build an uber jar so that I can deploy my Spark program.
Running the command
sbt assembly
produces a lot of errors:
[error] deduplicate: different file contents found in the following:
[error] /Users/samibadawi/.ivy2/cache/commons-collections/commons-collections/jars/commons-collections-3.2.1.jar:org/apache/commons/collections/FastHashMap$CollectionView$CollectionViewIterator.class
[error] /Users/samibadawi/.ivy2/cache/commons-beanutils/commons-beanutils/jars/commons-beanutils-1.7.0.jar:org/apache/commons/collections/FastHashMap$CollectionView$CollectionViewIterator.class
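The two paths above show that commons-beanutils 1.7.0 ships its own copy of a commons-collections class. A minimal sketch for listing every entry that two jars share, which can help decide what to exclude, is shown below. `JarOverlap` and its methods are hypothetical names for illustration and are not part of sbt or sbt-assembly.

```scala
import java.util.zip.ZipFile

// Hypothetical helper: lists entries that appear in both jars.
object JarOverlap {
  // Collect the names of all non-directory entries in a jar.
  def entries(jarPath: String): Set[String] = {
    val zip = new ZipFile(jarPath)
    try {
      val names = Set.newBuilder[String]
      val it = zip.entries()
      while (it.hasMoreElements) {
        val e = it.nextElement()
        if (!e.isDirectory) names += e.getName
      }
      names.result()
    } finally zip.close()
  }

  // Entries present in both jars, sorted for readable output.
  def shared(jarA: String, jarB: String): Seq[String] =
    (entries(jarA) intersect entries(jarB)).toSeq.sorted

  def main(args: Array[String]): Unit =
    if (args.length != 2) println("usage: JarOverlap <first.jar> <second.jar>")
    else shared(args(0), args(1)).foreach(println)
}
```

Passing the two jar paths from the error message as arguments would print the overlapping entries. Shared files under META-INF are expected and are usually handled by a merge strategy rather than an exclusion.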
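The answers to the equivalent question for Scala 2.10 do not work: spark + sbt-assembly: "deduplicate: different file contents found in the following"
After a lot of hacking, I got a hello world project with no useful code to compile using the build.sbt file below.
What to exclude and which merge strategy to use seems rather random. Is there a simpler, more systematic way to do this?
(Other than using "org.apache.spark" %% "spark-core" % sparkVersion % "provided", in which case the dependencies are not deployed.)
Excerpt from build.sbt: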
import sbtassembly.AssemblyPlugin._
//Define dependencies. These ones are only required for Test and Integration Test scopes.
libraryDependencies ++= Seq(
  ("org.apache.spark" %% "spark-core" % sparkVersion).
    exclude("commons-beanutils", "commons-beanutils-core").
    exclude("commons-collections", "commons-collections").
    exclude("commons-logging", "commons-logging").
    exclude("com.esotericsoftware.minlog", "minlog").
    exclude("com.codahale.metrics", "metrics-core").
    exclude("aopalliance", "aopalliance"),
  "org.scalatest" %% "scalatest" % "2.2.4" % "test,it"
)
mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
  {
    case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
    case PathList("javax", "inject", xs @ _*) => MergeStrategy.last
    case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
    case PathList("org", "apache", xs @ _*) => MergeStrategy.last
    case PathList("com", "google", xs @ _*) => MergeStrategy.last
    case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last
    case PathList("com", "codahale", xs @ _*) => MergeStrategy.last
    case PathList("com", "yammer", xs @ _*) => MergeStrategy.last
    case "about.html" => MergeStrategy.rename
    case "META-INF/ECLIPSEF.RSA" => MergeStrategy.last
    case "META-INF/mailcap" => MergeStrategy.last
    case "META-INF/mimetypes.default" => MergeStrategy.last
    case "plugin.properties" => MergeStrategy.last
    case "log4j.properties" => MergeStrategy.last
    case x => old(x)
  }
}
Project.inConfig(Test)(assemblySettings)
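Note that the `<<=` operator and the `mergeStrategy` key used above belong to older sbt and sbt-assembly releases. As a rough sketch only, assuming sbt 1.x and a recent sbt-assembly declared in project/plugins.sbt, the same idea can be written with the `assemblyMergeStrategy` key. Only two of the cases are repeated here for brevity.

```scala
// build.sbt fragment. Assumption: sbt 1.x with a recent sbt-assembly plugin enabled.
assembly / assemblyMergeStrategy := {
  case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
  case "log4j.properties"                    => MergeStrategy.last
  case x =>
    // Fall back to the plugin's default strategy for every other path.
    val oldStrategy = (assembly / assemblyMergeStrategy).value
    oldStrategy(x)
}
```

`MergeStrategy.last` keeps whichever copy comes last on the classpath, so it silences the deduplicate error without checking that the copies are compatible.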
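Answer 0 (score: 0)
After a bit more trial and error I produced a build.sbt that works for my real program.
One problem I ran into was duplicate versions of the Postgres jar. I solved it by commenting out these dependencies: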
// "org.postgresql" % "postgresql" % "9.4.1212", //Small gap between Doobie and Spark dependency
// "org.postgis" % "postgis-jdbc" % "1.3.3", //Causes conflicts
I have not started using PostGIS yet; it depends on postgresql-8.3-603.jdbc4.jar.
I had to take out the direct dependency on Postgres.
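An alternative to commenting the dependencies out would be to keep postgis-jdbc and exclude only the old driver it pulls in. This is an untested sketch: the coordinates `"postgresql" % "postgresql"` for the transitive 8.3-603.jdbc4 driver are an assumption and should be checked against the resolved dependency tree before relying on it.

```scala
// build.sbt fragment. Assumption: the old driver is published as postgresql:postgresql.
libraryDependencies ++= Seq(
  // Newer driver, declared directly.
  "org.postgresql" % "postgresql" % "9.4.1212",
  // Keep PostGIS but drop the old JDBC driver it depends on.
  ("org.postgis" % "postgis-jdbc" % "1.3.3").exclude("postgresql", "postgresql")
)
```

Whether postgis-jdbc 1.3.3 actually works against the newer driver is a separate question that this sketch does not answer.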
From the working build.sbt:
val doobieVersion = "0.4.1"
libraryDependencies ++= Seq(
  "ch.qos.logback" % "logback-classic" % "1.0.13", //comment and warning go away
  "ch.qos.logback" % "logback-core" % "1.0.13",
  "com.citymaps" % "tile-library" % "1.4",
  "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.7.2",
  "com.github.scopt" %% "scopt" % "3.5.0",
  "com.typesafe.play" %% "play-json" % "2.5.9",
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-mllib" % sparkVersion % "provided",
  "graphframes" % "graphframes" % "0.3.0-spark2.0-s_2.11",
  "org.clapper" %% "grizzled-slf4j" % "1.3.0",
  // "org.postgresql" % "postgresql" % "9.4.1212", //Small gap between Doobie and Spark dependency
  // "org.postgis" % "postgis-jdbc" % "1.3.3", //Causes conflicts
  "org.scalatest" %% "scalatest" % "3.0.0" % "test" withSources() withJavadoc(),
  "org.spire-math" %% "spire" % "0.11.0",
  "org.tpolecat" %% "doobie-core-cats" % doobieVersion,
  "org.tpolecat" %% "doobie-postgres-cats" % doobieVersion
)
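After running
sbt clean
this stopped working. It turned out that postgis-jdbc was the source of the conflict: its latest version is 2.2.1, but the latest version available in the common Maven repositories is 1.3.3, which depends on an old Postgres driver jar.
I looked through many repositories and could not find postgis-jdbc 2.2.1.
Download version 2.2.1 from https://github.com/postgis/postgis-java
The version in that release is set to 2.2.2SNAPSHOT, so change the version number in pom.xml and jdbc/pom.xml.
Build the jar with this command. It is picky about the Maven version: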
/usr/local/Cellar/maven/3.3.9/bin/mvn install
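Now add the local Maven repository as a resolver and include this dependency:
resolvers += Resolver.mavenLocal
libraryDependencies += "net.postgis" % "postgis-jdbc" % "2.2.1"
Then running
sbt assembly
finally worked.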