Apache Spark NoSuchMethodError when using .reduceByKey()

Asked: 2016-10-25 07:55:01

Tags: scala hadoop apache-spark

I am trying to run some examples in Apache Spark to learn more about it, but when I do so (in the spark-shell) I get this error:

java.lang.NoSuchMethodError: org.apache.hadoop.conf.Configuration.addDeprecations([Lorg/apache/hadoop/conf/Configuration$DeprecationDelta;)V

The full session and stack trace are below. I hope you can help me.

pcitbu@pcitbumint /usr/spark/spark-2.0.1-bin-hadoop2.7 $ spark-shell
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/10/25 09:52:38 WARN SparkConf: 
SPARK_WORKER_INSTANCES was detected (set to '3').
This is deprecated in Spark 1.0+.

Please instead use:
 - ./spark-submit with --num-executors to specify the number of executors
 - Or set SPARK_EXECUTOR_INSTANCES
 - spark.executor.instances to configure the number of instances in the spark config.

16/10/25 09:52:38 WARN Utils: Your hostname, pcitbumint resolves to a loopback address: 127.0.1.1; using 192.168.0.119 instead (on interface ens33)
16/10/25 09:52:38 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
16/10/25 09:52:38 WARN SparkContext: Use an existing SparkContext, some configuration may not take effect.
Spark context Web UI available at http://192.168.0.119:4040
Spark context available as 'sc' (master = local[*], app id = local-1477381958561).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.0.1
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_101)
Type in expressions to have them evaluated.
Type :help for more information.

scala> val file = sc.textFile("README.md")
file: org.apache.spark.rdd.RDD[String] = README.md MapPartitionsRDD[1] at textFile at <console>:24

scala> val counts = file.flatMap(line => line.split(" "))
counts: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[2] at flatMap at <console>:26

scala> .map(word => (word, 1))
res0: org.apache.spark.rdd.RDD[(String, Int)] = MapPartitionsRDD[3] at map at <console>:29

scala> .reduceByKey(_ + _)
java.lang.NoSuchMethodError: org.apache.hadoop.conf.Configuration.addDeprecations([Lorg/apache/hadoop/conf/Configuration$DeprecationDelta;)V
  at org.apache.hadoop.hdfs.HdfsConfiguration.addDeprecatedKeys(HdfsConfiguration.java:66)
  at org.apache.hadoop.hdfs.HdfsConfiguration.<clinit>(HdfsConfiguration.java:31)
  at org.apache.hadoop.hdfs.DistributedFileSystem.<clinit>(DistributedFileSystem.java:116)
  at java.lang.Class.forName0(Native Method)
  at java.lang.Class.forName(Class.java:348)
  at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:810)
  at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:855)
  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1440)
  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
  at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:124)
  at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:563)
  at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:318)
  at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:291)
  at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$29.apply(SparkContext.scala:992)
  at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$29.apply(SparkContext.scala:992)
  at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
  at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
  at scala.Option.map(Option.scala:146)
  at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:176)
  at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:195)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
  at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:248)
  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:246)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.rdd.RDD.partitions(RDD.scala:246)
  at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:65)
  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKey$3.apply(PairRDDFunctions.scala:328)
  at org.apache.spark.rdd.PairRDDFunctions$$anonfun$reduceByKey$3.apply(PairRDDFunctions.scala:328)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:358)
  at org.apache.spark.rdd.PairRDDFunctions.reduceByKey(PairRDDFunctions.scala:327)
  ... 48 elided

1 Answer:

Answer 0 (score: 0)

Please check your Hadoop version, more specifically the version of hadoop-common-x.x.x.jar.

A few points:

  1. Did you build Spark from source yourself? If so, check which Hadoop version you built it against.
  2. If you are using a pre-built distribution, check the version of hadoop-common in your installation directory, i.e. /usr/spark/spark-2.0.1-bin-hadoop2.7 (a quick way to check this from spark-shell is sketched below).
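
As a quick sanity check (a minimal sketch, not part of the original answer), you can ask the running spark-shell which Hadoop build it actually loaded. org.apache.hadoop.util.VersionInfo ships with hadoop-common, so the version it reports and the jar it was loaded from should match the 2.7.x jars bundled under spark-2.0.1-bin-hadoop2.7:

// Paste into spark-shell to see which Hadoop build is really on the classpath.
import org.apache.hadoop.util.VersionInfo

println("Hadoop version: " + VersionInfo.getVersion)

// Show which jar org.apache.hadoop.conf.Configuration was loaded from; if this points
// at an older hadoop-common than the HDFS classes expect, that mismatch would explain
// the NoSuchMethodError on Configuration.addDeprecations.
val confClass = classOf[org.apache.hadoop.conf.Configuration]
println(confClass.getProtectionDomain.getCodeSource.getLocation)

If the printed version or jar location does not match the jars shipped in the installation directory, an extra Hadoop jar is probably being picked up from the environment (for example via HADOOP_HOME or an old classpath entry) and should be removed.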