Converting an RDD[(Int, Int)] to a PairRDD in Scala

Time: 2018-02-23 11:23:28

Tags: scala apache-spark

What is wrong with this example?

val f = sc.parallelize(Array((1,1),(1,2)))
val p = new org.apache.spark.rdd.PairRDDFunctions[Int,Int](f)

Name: Compile Error
Message:  error: type mismatch;
 found   : org.apache.spark.rdd.org.apache.spark.rdd.org.apache.spark.rdd.org.apache.spark.rdd.org.apache.spark.rdd.RDD[(Int, Int)]
 required: org.apache.spark.rdd.org.apache.spark.rdd.org.apache.spark.rdd.org.apache.spark.rdd.org.apache.spark.rdd.RDD[(Int, Int)]
       val p = new org.apache.spark.rdd.PairRDDFunctions[Int,Int](f)
                                                                  ^

1 answer:

Answer 0 (score: 1)

Your code seems to work fine on Spark 2.2.0.

Here is a transcript of the console session on Spark 2.2.0:

scala> val f = sc.parallelize(Array((1,1),(1,2)))
f: org.apache.spark.rdd.RDD[(Int, Int)] = ParallelCollectionRDD[0] at parallelize at <console>:24

scala> val p = new org.apache.spark.rdd.PairRDDFunctions[Int,Int](f)
p: org.apache.spark.rdd.PairRDDFunctions[Int,Int] = org.apache.spark.rdd.PairRDDFunctions@6e1d939e

scala> p
res0: org.apache.spark.rdd.PairRDDFunctions[Int,Int] = org.apache.spark.rdd.PairRDDFunctions@6e1d939e

scala> f
res1: org.apache.spark.rdd.RDD[(Int, Int)] = ParallelCollectionRDD[0] at parallelize at <console>:24

Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_131)

So this looks like a bug in an older version.
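
As a side note, wrapping the RDD by hand is usually unnecessary. From Spark 1.3 onward, the implicit conversion `RDD.rddToPairRDDFunctions` lives in the `RDD` companion object, so pair operations such as `reduceByKey` should be available directly on an `RDD[(Int, Int)]`. Below is a minimal sketch, assuming a local master. The object name `PairRddSketch` and the helper `sumByKey` are made up for illustration and are not part of the original question.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical example object; the names here are illustrative only.
object PairRddSketch {

  // Sums the values per key without constructing PairRDDFunctions explicitly.
  def sumByKey(sc: SparkContext, pairs: Seq[(Int, Int)]): Map[Int, Int] = {
    val f: RDD[(Int, Int)] = sc.parallelize(pairs)
    // reduceByKey is defined on PairRDDFunctions; the compiler inserts the
    // implicit conversion RDD.rddToPairRDDFunctions for an RDD of pairs.
    f.reduceByKey(_ + _).collect().toMap
  }

  def main(args: Array[String]): Unit = {
    // Assumption: run locally; in spark-shell `sc` is already provided.
    val conf = new SparkConf().setAppName("pair-rdd-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      println(sumByKey(sc, Seq((1, 1), (1, 2)))) // prints Map(1 -> 3)
    } finally {
      sc.stop()
    }
  }
}
```

In spark-shell, where `sc` already exists, `f.reduceByKey(_ + _).collect()` on the `f` from the question should be all that is needed.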