I am trying to compile and assemble a Scala project using sbt-pack. This is the build.sbt file, and this is plugins.sbt.
The project has two subprojects, common and main. common has this structure:
MacBook-Pro-Retina-de-Alonso:common aironman$ ls src/main/scala/common/utils/cassandra/
CassandraConnectionUri.scala Helper.scala Pillar.scala
main has this:
MacBook-Pro-Retina-de-Alonso:my-twitter-cassandra-app aironman$ ls main/src/main/scala/
com common
Inside the com folder I have this folder:
MacBook-Pro-Retina-de-Alonso:my-twitter-cassandra-app aironman$ ls main/src/main/scala/com/databricks/apps/twitter_classifier/
Collect.scala ExamineAndTrain.scala Predict.scala Utils.scala
Collect, ExamineAndTrain and Predict are objects with main methods.
Inside the common folder I have this:
MacBook-Pro-Retina-de-Alonso:my-twitter-cassandra-app aironman$ ls main/src/main/scala/common/utils/cassandra/
CassandraMain.scala
The project compiles, and I can pack it, which generates some folders under the target folder. This is the set of folders, and these are the libraries under target/pack/lib.
The problem happens when I try to run the generated command:
MacBook-Pro-Retina-de-Alonso:my-twitter-cassandra-app aironman$ target/pack/bin/collect /tmp/tweets 10000 10 1
Initializing Streaming Spark Context...
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/03/15 11:02:54 INFO SparkContext: Running Spark version 1.4.0
Exception in thread "main" java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class
at org.apache.spark.util.TimeStampedWeakValueHashMap.<init>(TimeStampedWeakValueHashMap.scala:42)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:277)
at com.databricks.apps.twitter_classifier.Collect$.main(Collect.scala:37)
at com.databricks.apps.twitter_classifier.Collect.main(Collect.scala)
Caused by: java.lang.ClassNotFoundException: scala.collection.GenTraversableOnce$class
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 4 more
16/03/15 11:02:54 INFO Utils: Shutdown hook called
Spark almost manages to start, but it finally crashes. I know the most likely problem here is that some libraries are compiled against Scala 2.10 while others are compiled against 2.11. However, in my build.sbt file I set scalaVersion := "2.10.4" in val commonSettings, I defined val sparkDependencies without "% provided %", and I can see that every Spark jar is compiled with 2.10:
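For reference, a minimal sketch of what the relevant build.sbt fragment could look like (the names commonSettings and sparkDependencies come from the question; the organization and module list are assumptions). Using %% instead of % lets sbt append the Scala binary suffix (_2.10 here) automatically, so every artifact stays in sync with scalaVersion:

```scala
// Hypothetical build.sbt sketch, not the actual file from the question.
lazy val commonSettings = Seq(
  organization := "com.example",   // assumed
  version      := "0.1-SNAPSHOT",
  scalaVersion := "2.10.4"
)

val sparkVersion = "1.4.0"

// %% derives the _2.10 suffix from scalaVersion, so a mismatched
// _2.11 artifact cannot be selected by accident.
val sparkDependencies = Seq(
  "org.apache.spark" %% "spark-core"              % sparkVersion,
  "org.apache.spark" %% "spark-streaming"         % sparkVersion,
  "org.apache.spark" %% "spark-streaming-twitter" % sparkVersion,
  "org.apache.spark" %% "spark-mllib"             % sparkVersion
)
```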
MacBook-Pro-Retina-de-Alonso:my-twitter-cassandra-app aironman$ ls target/pack/lib/spark*
target/pack/lib/spark-catalyst_2.10-1.4.0.jar
target/pack/lib/spark-network-shuffle_2.10-1.4.0.jar
target/pack/lib/spark-core_2.10-1.4.0.jar
target/pack/lib/spark-sql_2.10-1.4.0.jar
target/pack/lib/spark-graphx_2.10-1.4.0.jar
target/pack/lib/spark-streaming-twitter_2.10-1.4.0.jar
target/pack/lib/spark-launcher_2.10-1.4.0.jar
target/pack/lib/spark-streaming_2.10-1.4.0.jar
target/pack/lib/spark-mllib_2.10-1.4.0.jar
target/pack/lib/spark-twitter-lang-classifier-using-cassandra_2.10-0.1-SNAPSHOT.jar
target/pack/lib/spark-network-common_2.10-1.4.0.jar
target/pack/lib/spark-unsafe_2.10-1.4.0.jar
UPDATE
I can see these Scala libraries in the lib folder:
MacBook-Pro-Retina-de-Alonso:my-twitter-cassandra-app aironman$ ls target/pack/lib/scala*
**target/pack/lib/scala-async_2.11-0.9.1.jar
target/pack/lib/scala-library-2.11.1.jar**
target/pack/lib/scalap-2.10.0.jar
target/pack/lib/scala-compiler-2.10.4.jar
target/pack/lib/scala-reflect-2.10.4.jar
The question now is: if, in theory, I am compiling with 2.10.4, how did scala-library-2.11.1.jar and scala-async_2.11-0.9.1.jar end up in the lib folder?
How can I force the correct version to be used?
UPDATE 2
The problem was caused by the wrong version of "com.chrisomeara" % "pillar_2.11" % "2.0.1". The correct dependency is "com.chrisomeara" % "pillar_2.10" % "2.0.1".
With this setting in place, I can now see the correct Scala libraries:
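This class of mistake can be avoided entirely by letting sbt pick the suffix, a one-line sketch:

```scala
// With a single %, the Scala suffix is hand-written and can drift from
// scalaVersion (which is how the _2.11 artifact slipped in):
//   "com.chrisomeara" % "pillar_2.11" % "2.0.1"
// With %%, sbt derives the suffix from scalaVersion (2.10 here):
libraryDependencies += "com.chrisomeara" %% "pillar" % "2.0.1"
```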
MacBook-Pro-Retina-de-Alonso:my-twitter-cassandra-app aironman$ ls target/pack/lib/scala*
target/pack/lib/scala-async_2.10-0.9.1.jar
target/pack/lib/scala-library-2.10.6.jar
target/pack/lib/scalap-2.10.0.jar
target/pack/lib/scala-compiler-2.10.4.jar
target/pack/lib/scala-reflect-2.10.4.jar
There is now another exception, but I think that is a separate, unrelated problem, so thank you, Yuval.
MacBook-Pro-Retina-de-Alonso:my-twitter-cassandra-app aironman$ target/pack/bin/collect /tmp/tweets 10000 10 1
Initializing Streaming Spark Context...
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/03/15 11:39:12 INFO SparkContext: Running Spark version 1.4.0
16/03/15 11:39:12 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/03/15 11:39:12 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: A master URL must be set in your configuration
at org.apache.spark.SparkContext.<init>(SparkContext.scala:368)
at com.databricks.apps.twitter_classifier.Collect$.main(Collect.scala:37)
at com.databricks.apps.twitter_classifier.Collect.main(Collect.scala)
16/03/15 11:39:12 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.spark.SparkException: A master URL must be set in your configuration
at org.apache.spark.SparkContext.<init>(SparkContext.scala:368)
at com.databricks.apps.twitter_classifier.Collect$.main(Collect.scala:37)
at com.databricks.apps.twitter_classifier.Collect.main(Collect.scala)
16/03/15 11:39:12 INFO Utils: Shutdown hook called
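The new failure ("A master URL must be set in your configuration") happens because the sbt-pack launcher runs the main class directly rather than going through spark-submit, so no master is injected. A minimal sketch of a fix, assuming Collect builds its own SparkConf (the property lookup and "local[*]" default are illustrative assumptions, not the actual Collect.scala):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Set a master explicitly, since nothing supplies one when the
// pack-generated script launches the JVM directly.
val conf = new SparkConf()
  .setAppName("Collect")
  // Fall back to local[*] (all cores, in-process) unless the user
  // passes -Dspark.master=... on the command line.
  .setMaster(sys.props.getOrElse("spark.master", "local[*]"))
val sc = new SparkContext(conf)
```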