Scala: Getting error - mapPartitionsWithIndex is not a member of org.apache.spark.rdd.RDD[Int]

Asked: 2018-03-16 02:02:06

Tags: scala apache-spark

I am a beginner with Apache Spark and Scala. I get an error when I run the following code:

val z = sc.parallelize(List(1,2,3,4,5,6),2)

//print the content with partition labels
def myfunc(index: Int, iter: Iterator[(int)] ): Iterator[String] = {
    iter.toList.map( x => "[Part Id: " + index + " ,val:" + x + "]").iterator
}

z.mapPartitionsWithIndex(myfunc).collect
error : value mapPartitionsWithIndex is not a member of org.apache.spark.rdd.RDD[Int]

I would like to know what is wrong in the code. Could you explain it? Thanks in advance.

1 Answer:

Answer 0 (score: 1):

It should be Iterator[Int] instead of Iterator[(int)]: Scala's integer type is Int with a capital I, so lowercase int is not a valid type.
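Applied to the code from the question, that is the only change needed (a sketch, keeping the original two-partition RDD):

val z = sc.parallelize(List(1,2,3,4,5,6),2)

// signature fixed: Iterator[Int] instead of Iterator[(int)]
def myfunc(index: Int, iter: Iterator[Int]): Iterator[String] = {
    iter.toList.map( x => "[Part Id: " + index + " ,val:" + x + "]").iterator
}

z.mapPartitionsWithIndex(myfunc).collect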

Or as follows:

val z = sc.parallelize(List(1, 2, 3, 4, 5))

// prefix each element with the id of the partition it belongs to
def func(index: Int, iter: Iterator[Int]): Iterator[String] = {
  iter.map(x => s"[Part ID: ${index}, val: ${x}]")
}

z.mapPartitionsWithIndex(func).collect()
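
For reference, with a default parallelism of two (an assumption; the actual number of partitions depends on the Spark configuration), the collect() above would return something like:

Array([Part ID: 0, val: 1], [Part ID: 0, val: 2], [Part ID: 1, val: 3], [Part ID: 1, val: 4], [Part ID: 1, val: 5])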