Here is my code. I am trying to compute range weights inside a map function. Each individual weight is stored in countsAsMap.
val countsAsMap: scala.collection.Map[Int, Int] = counts.collectAsMap()
// countsAsMap: scala.collection.Map[Int,Int] = Map(137 -> 91, 146 -> 83, 218 -> 26, 227 -> 16, ...)
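import scala.collection.mutable.MutableList

// Collect every (i, j) pair with min <= i <= j <= max
var rangeMatrix = MutableList[(Int, Int)]()
for (i: Int <- min to max;
     j: Int <- min to max) {
  if (i <= j) {
    rangeMatrix += ((i, j))
  }
}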
// rangeMatrix : ((301,301), (300,301), (300,300), (299,301), (299,300), ...)
// Creating parallelizable RDD for rangeMatrix
var matrixRDD = sc.parallelize(rangeMatrix)
val rangeWeight = matrixRDD.map(r => {
  var total = 0
  for (k <- r._1 to r._2) {
    total = total + countsAsMap(k)
  }
  total
})
rangeWeight.take(1).foreach(println)
Running this fails. I have tried several approaches, but every one of them ends with the following exception:
java.lang.StackOverflowError
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1108)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
Note: if I run the same transformation (with the same map) on a plain collection (a List) instead of an RDD, it works fine.
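
For comparison, the local-collection variant mentioned in the note looks roughly like the sketch below. The object name RangeWeightsLocal, the helper rangeWeights, and the small sample map and bounds are made-up stand-ins for illustration, not my real data or code.

```scala
object RangeWeightsLocal {
  // Hypothetical helper: sums the counts for every (i, j) range with min <= i <= j <= max,
  // using a plain immutable List instead of an RDD.
  def rangeWeights(counts: scala.collection.Map[Int, Int], min: Int, max: Int): List[Int] = {
    // Same pairs as rangeMatrix, built directly as an immutable List
    val ranges: List[(Int, Int)] =
      (for (i <- min to max; j <- i to max) yield (i, j)).toList

    // Same per-range sum as in the RDD map, run locally
    ranges.map { case (lo, hi) =>
      (lo to hi).map(k => counts(k)).sum
    }
  }

  def main(args: Array[String]): Unit = {
    // Assumed sample data, standing in for countsAsMap, min and max
    val sampleCounts = Map(1 -> 10, 2 -> 20, 3 -> 30)
    rangeWeights(sampleCounts, 1, 3).take(1).foreach(println) // prints 10
  }
}
```

This version runs without any exception, which is why I suspect the problem is related to what gets serialized when the RDD is built or the task is shipped, not to the summing logic itself.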