Spark: How to transform RDD[(Long, Iterable[String])] into RDD[(Long, String)]?

Time: 2016-06-30 20:12:20

Tags: scala apache-spark

How can I transform this RDD[(Long, Iterable[String])]:

(852403,Set(PT0000094043, PT0000097083, PT0000036162))
(357331,Set(PT0000068829, PT0000094042, PT0000066859))

into an RDD[(Long, String)] like this?

(852403, PT0000094043)
(852403, PT0000097083)
(852403, PT0000036162)
(357331, PT0000068829)
(357331, PT0000094042)
(357331, PT0000066859)

1 answer:

Answer 0 (score: 2)

Try flatMapValues, which flattens each value collection while keeping its key:

rdd.flatMapValues(identity)

or flatMap, pairing the key with each value explicitly:

rdd.flatMap{ case (k, vs) => vs.map(v => (k, v)) }
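
To see what the flattening does without a Spark cluster, here is a minimal sketch that runs the same function body on plain Scala collections. The object name FlattenPairs and the val names grouped and flattened are made up for illustration, and a Seq stands in for the RDD, so this only shows the per-record logic, not Spark's distributed behaviour.

```scala
// Plain Scala collections standing in for the RDD; no Spark dependency needed.
// FlattenPairs, grouped and flattened are illustrative names, not Spark API.
object FlattenPairs {
  // Sample data copied from the question.
  val grouped: Seq[(Long, Iterable[String])] = Seq(
    (852403L, Set("PT0000094043", "PT0000097083", "PT0000036162")),
    (357331L, Set("PT0000068829", "PT0000094042", "PT0000066859"))
  )

  // Same function body as the flatMap variant in the answer:
  // emit one (key, value) pair for every element of the Iterable.
  val flattened: Seq[(Long, String)] =
    grouped.flatMap { case (k, vs) => vs.map(v => (k, v)) }

  def main(args: Array[String]): Unit =
    flattened.foreach(println)
}
```

Each key is repeated once per value, so the two input records become six pairs. A key whose Iterable is empty would produce no output pairs at all.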