I am using Spark 1.0.0. When I run the code below, I get the exception shown further down. I have traced the exception to JavaPairRDD's takeOrdered(int num, Comparator<T> comp) method. How can I fix this?
Spark Maven dependency:
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.0.0</version>
</dependency>
Here is my code:
SparkConf sparkConf = new SparkConf().setAppName(appName).setMaster(master);
JavaSparkContext sc = new JavaSparkContext(sparkConf);

Configuration conf = HBaseConfiguration.create();
conf.set(TableInputFormat.INPUT_TABLE, tableName);

try {
    // Create the HBase table if it does not exist yet.
    HBaseAdmin admin = new HBaseAdmin(conf);
    if (!admin.isTableAvailable(tableName)) {
        HTableDescriptor tableDesc = new HTableDescriptor(TableName.valueOf(tableName));
        admin.createTable(tableDesc);
    }

    // Read the HBase table as an RDD of (row key, Result) pairs.
    JavaPairRDD<ImmutableBytesWritable, Result> hBaseRDD = sc.newAPIHadoopRDD(
            conf,
            TableInputFormat.class,
            org.apache.hadoop.hbase.io.ImmutableBytesWritable.class,
            org.apache.hadoop.hbase.client.Result.class);

    // For each row, count the cells whose timestamp falls within the last numberOfDay days.
    JavaPairRDD<String, Integer> pairs = hBaseRDD
            .mapToPair(new PairFunction<Tuple2<ImmutableBytesWritable, Result>, String, Integer>() {

                private static final long serialVersionUID = -77767105936599216L;

                @Override
                public Tuple2<String, Integer> call(
                        Tuple2<ImmutableBytesWritable, Result> tuple) throws Exception {
                    Result r = tuple._2;
                    String userId = new String(r.getRow());
                    int i = 0;
                    for (Cell c : r.rawCells()) {
                        if (compareDatesInMilis(now, c.getTimestamp()) <= numberOfDay) {
                            i++;
                        }
                    }
                    return new Tuple2<String, Integer>(userId, i);
                }
            });

    // This is the call that triggers the exception.
    List<Tuple2<String, Integer>> l = pairs.takeOrdered(10, new TupleComparator());

    admin.close();
    sc.stop();
} catch (MasterNotRunningException e) {
    e.printStackTrace();
} catch (ZooKeeperConnectionException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}
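The TupleComparator passed to takeOrdered above is not shown in the question. For reference, here is a minimal, hypothetical sketch of such a comparator; the class body and the ordering by the Integer count are assumptions, but Spark does serialize the comparator to the executors, so it must implement Serializable:

import java.io.Serializable;
import java.util.Comparator;
import scala.Tuple2;

// Hypothetical comparator -- the original class is not shown in the question.
// It must be Serializable because Spark ships it to the executors.
public class TupleComparator implements Comparator<Tuple2<String, Integer>>, Serializable {
    private static final long serialVersionUID = 1L;

    @Override
    public int compare(Tuple2<String, Integer> a, Tuple2<String, Integer> b) {
        // Assumed ordering: compare by the per-row count (the Integer half of the pair).
        return a._2.compareTo(b._2);
    }
}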
I get this exception:
java.lang.NoSuchMethodError: com.google.common.collect.Ordering.leastOf(Ljava/util/Iterator;I)Ljava/util/List;
at org.apache.spark.util.collection.Utils$.takeOrdered(Utils.scala:37)
at org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1.apply(RDD.scala:1043)
at org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1.apply(RDD.scala:1040)
at org.apache.spark.rdd.RDD$$anonfun$12.apply(RDD.scala:559)
at org.apache.spark.rdd.RDD$$anonfun$12.apply(RDD.scala:559)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
at org.apache.spark.scheduler.Task.run(Task.scala:51)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
2014-06-25 16:48:49,544 ERROR [Executor task launch worker-1] executor.ExecutorUncaughtExceptionHandler (Logging.scala:logError(95)) - Uncaught exception in thread Thread[Executor task launch worker-1,5,main]
java.lang.NoSuchMethodError: com.google.common.collect.Ordering.leastOf(Ljava/util/Iterator;I)Ljava/util/List;
at org.apache.spark.util.collection.Utils$.takeOrdered(Utils.scala:37)
at org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1.apply(RDD.scala:1043)
at org.apache.spark.rdd.RDD$$anonfun$takeOrdered$1.apply(RDD.scala:1040)
at org.apache.spark.rdd.RDD$$anonfun$12.apply(RDD.scala:559)
at org.apache.spark.rdd.RDD$$anonfun$12.apply(RDD.scala:559)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:111)
at org.apache.spark.scheduler.Task.run(Task.scala:51)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Answer (score: 1)
I changed the version of my Spark Maven dependency and the problem was solved. Here is my new Spark dependency:
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>0.9.1</version>
</dependency>
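For what it's worth, a NoSuchMethodError on com.google.common.collect.Ordering.leastOf usually points to a Guava version conflict: an older Guava jar pulled in by another dependency (HBase ships its own, for example) shadows the version Spark was built against. A quick, hypothetical diagnostic to see which Guava jar actually wins on the classpath at runtime:

import com.google.common.collect.Ordering;

public class GuavaVersionCheck {
    public static void main(String[] args) {
        // Prints the jar from which Guava's Ordering class was loaded,
        // which reveals which Guava version wins on the classpath.
        System.out.println(
                Ordering.class.getProtectionDomain().getCodeSource().getLocation());
    }
}

If an older Guava shows up there, aligning the Guava versions (or downgrading Spark, as above) resolves the conflict.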
Thanks.