I'm trying to understand why no serialization error is raised when this code runs:
dstream.foreachRDD { rdd =>
  rdd.cache()
  // runs on the driver; restServer is not serializable
  val alternatives = restServer.get("/v1/alternatives").toSet
  alternatives.foreach { alternative =>
    val filteredRDD = rdd.filter(element => element.kind == alternative)
    val formatter = new Formatter(alternative)
    val recordRDD = filteredRDD.map(element => formatter(element))
    recordRDD.foreachPartition { partition =>
      // one connection per partition, opened on the executor
      val conn = DB.connect(server)
      partition.foreach(element => conn.insert(alternative, element))
    }
  }
  rdd.unpersist(true)
}
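So, for example, when executing rdd.filter(element => element.kind == alternative), shouldn't a closure serialization error be raised, since restServer is not serializable and is in the scope of the closure we pass to the filter operation? By the same logic, rdd itself would also be in the scope of the closure passed to filter.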
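
To make the question concrete, here is a minimal sketch outside Spark that checks what a Scala function literal actually carries when it is Java-serialized. It assumes Scala 2.12 or later. `Element`, `RestServer`, and `ClosureCaptureSketch` are hypothetical stand-ins for the types in my code, and plain `ObjectOutputStream` is used only as a rough approximation of what happens to the closure passed to filter.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}
import scala.util.Try

// Hypothetical stand-ins for the types used in the question.
final case class Element(kind: String)

final class RestServer { // deliberately NOT Serializable
  def get(path: String): Seq[String] = Seq("a", "b")
}

object ClosureCaptureSketch {
  // Returns true if plain Java serialization of `obj` succeeds.
  def serializes(obj: AnyRef): Boolean =
    Try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      try out.writeObject(obj) finally out.close()
    }.isSuccess

  def run(): (Boolean, Boolean) = {
    val restServer = new RestServer
    val alternative = restServer.get("/v1/alternatives").head

    // restServer is in lexical scope here, but the body only mentions `alternative`.
    val usesOnlyAlternative: Element => Boolean =
      e => e.kind == alternative

    // This body mentions restServer, so the closure has to carry it.
    val usesRestServer: Element => Boolean =
      e => restServer.get("/v1/alternatives").contains(e.kind)

    (serializes(usesOnlyAlternative), serializes(usesRestServer))
  }

  def main(args: Array[String]): Unit = println(run())
}
```

If my reading is right, the first closure serializes and the second does not, which would suggest that only the variables a closure body references are captured, not everything that happens to be in the enclosing scope.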