I have an RDD and I want to loop over it. This is what I do:
pointsMap.foreach({ p =>
    val pointsWithCoordinatesWithDistance = pointsMap.leftOuterJoin(xCoordinatesWithDistance)
    pointsWithCoordinatesWithDistance.foreach(println)
    println("---")
})
However, this throws a NullPointerException:
java.lang.NullPointerException
at org.apache.spark.rdd.RDD.<init>(RDD.scala:125)
at org.apache.spark.rdd.CoGroupedRDD.<init>(CoGroupedRDD.scala:69)
at org.apache.spark.rdd.PairRDDFunctions.cogroup(PairRDDFunctions.scala:651)
at org.apache.spark.rdd.PairRDDFunctions.leftOuterJoin(PairRDDFunctions.scala:483)
at org.apache.spark.rdd.PairRDDFunctions.leftOuterJoin(PairRDDFunctions.scala:555)
...
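Both pointsMap and xCoordinatesWithDistance are initialized before the foreach and contain elements. The leftOuterJoin also works when it is called outside the foreach loop. For the full version of my code, see https://github.com/timasjov/spark-learning/blob/master/src/DBSCAN.scala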
Answer 0 (score: 2)
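Do not use an RDD inside the function you pass to an RDD operator. When you want to operate on several RDDs at once, use the appropriate RDD operator, such as join.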
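As a minimal sketch of that advice, the join can be moved out of the foreach closure so that it is issued once from the driver. The likely cause of the exception is that RDD operations can only be launched from the driver, not from code running inside another RDD's tasks. The element types (Int keys, String and Double values), the sample data, the object name JoinOutsideForeach, and the helper joinPoints are assumptions made for illustration and do not come from the question's code. The sketch also assumes a reasonably recent Spark version, where the pair-RDD implicits are picked up automatically; older releases need import org.apache.spark.SparkContext._.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object JoinOutsideForeach {
  // Hypothetical helper: the join is called on the driver, once,
  // instead of inside a closure that runs on the executors.
  // Key and value types are assumed; the question does not show them.
  def joinPoints(
      pointsMap: RDD[(Int, String)],
      xCoordinatesWithDistance: RDD[(Int, Double)]
  ): RDD[(Int, (String, Option[Double]))] =
    pointsMap.leftOuterJoin(xCoordinatesWithDistance)

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("join-outside-foreach").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Made-up sample data standing in for the question's RDDs.
      val pointsMap = sc.parallelize(Seq(1 -> "a", 2 -> "b", 3 -> "c"))
      val xCoordinatesWithDistance = sc.parallelize(Seq(1 -> 0.5, 3 -> 1.5))

      val joined = joinPoints(pointsMap, xCoordinatesWithDistance)

      // The closure passed to foreach below only touches plain values, never an RDD.
      // collect() brings the rows to the driver so println output appears locally.
      joined.collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

If the loop body really needs per-element access to the other data set, the same rule applies: express it as a join (or another multi-RDD operator) first, then iterate over the joined result.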