I am a beginner with Apache Spark and Scala. I get an error when I run the following code:
val z = sc.parallelize(List(1,2,3,4,5,6),2)
//print the content with partition labels
def myfunc(index: Int, iter: Iterator[(int)] ): Iterator[String] = {
iter.toList.map( x => "[Part Id: " + index + " ,val:" + x + "]").iterator
}
z.mapPartitionsWithIndex(myfunc).collect
error : value mapPartitionsWithIndex is not a member of org.apache.spark.rdd.RDD[Int]
What is wrong with this code? Could someone explain? Thanks in advance.
Answer 0 (score: 1)
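The parameter type should be Iterator[Int] rather than Iterator[(int)]. Scala's integer type is Int, with a capital I; there is no type named int. Iterator[(Int)] also compiles, because parentheses around a single type do not change it.
Alternatively, write it as follows: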
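val z = sc.parallelize(List(1, 2, 3, 4, 5))
// Label every element with the index of the partition that holds it.
def func(index: Int, iter: Iterator[Int]): Iterator[String] = {
  iter.map(x => s"[Part ID: ${index}, val: ${x}]")
}
// collect() returns the labelled strings to the driver as an Array[String].
z.mapPartitionsWithIndex(func).collect()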
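
Because the function passed to mapPartitionsWithIndex is ordinary Scala, its signature and output can be checked without a SparkContext. The following is a minimal sketch and not part of the original answer: the object name PartitionLabels, the helper labelPartition, and the hand-written partition contents are assumptions made for illustration. The two lists only stand in for how six elements might be split across two partitions.

```scala
object PartitionLabels {
  // Hypothetical helper with the same shape as the function in the answer:
  // it takes a partition index and an Iterator[Int] (capital-I Int).
  def labelPartition(index: Int, iter: Iterator[Int]): Iterator[String] =
    iter.map(x => s"[Part ID: $index, val: $x]")

  // Assumed partition contents, written by hand instead of coming from an RDD.
  val partitions: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6))

  // Call the helper once per partition together with that partition's index,
  // which mirrors what mapPartitionsWithIndex does for each partition of an RDD.
  def labelAll(parts: List[List[Int]]): List[String] =
    parts.zipWithIndex.flatMap { case (part, idx) =>
      labelPartition(idx, part.iterator)
    }

  def main(args: Array[String]): Unit =
    labelAll(partitions).foreach(println)
}
```

If the parameter type is changed back to Iterator[(int)], this sketch should fail to compile as well, which suggests the problem lies in the Scala type name rather than in the Spark call.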