Denormalizing a Map type

Time: 2016-03-29 10:18:31

Tags: apache-spark

Given this collection:

("user1",Map("Gobelin" -> "2","Archers" ->"3"))
("user2",Map("Giant" -> "1"))

I would like output that looks like this:

("user1","Gobelin","2")
("user1","Archers","3")
("user2","Giant","1")

How can I achieve this with Spark?

1 answer:

Answer 0 (score: 1)

You are most likely looking for flatMapValues:

val rdd = sc.parallelize(
  ("user1", Map("Gobelin" -> "2", "Archers" -> "3")) ::
  ("user2", Map("Giant" -> "1")) :: Nil)

// Yields pairs of the form (user, (key, value)), e.g. ("user1", ("Gobelin", "2"))
rdd.flatMapValues(identity[Map[String, String]])
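Note that flatMapValues on its own leaves each record as a nested pair, (user, (key, value)), rather than the flat triple shown in the question. A minimal sketch of one way to finish the job, assuming a local SparkContext; the object name FlattenMapValues, the helper flatten, and the local[*] master are illustrative choices, not part of the original answer:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object FlattenMapValues {
  // Turns (user, Map(key -> value)) records into flat (user, key, value) triples.
  def flatten(sc: SparkContext): Array[(String, String, String)] = {
    val rdd = sc.parallelize(Seq(
      ("user1", Map("Gobelin" -> "2", "Archers" -> "3")),
      ("user2", Map("Giant" -> "1"))))

    rdd
      .flatMapValues(_.toSeq) // RDD[(String, (String, String))]
      .map { case (user, (key, value)) => (user, key, value) } // un-nest the pair
      .collect()
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical local setup; in spark-shell `sc` already exists.
    val conf = new SparkConf().setAppName("flatten-map-values").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try flatten(sc).foreach(println)
    finally sc.stop()
  }
}
```

The same result can be had in a single step with flatMap, by building the triples inside the function instead of un-nesting afterwards; the two-step form is used here only because it stays close to the flatMapValues call above.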

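Or, if you are working with DataFrames, use explode (outside spark-shell this also needs the SQL implicits in scope for toDF and the $"..." column syntax):

import org.apache.spark.sql.functions.explode

rdd.toDF.select($"_1", explode($"_2")).as[(String, String, String)].rdd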
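For a standalone application, the DataFrame route might look like the sketch below. It assumes Spark 2.0 or later, where SparkSession is the entry point (the answer itself predates that and would have used a SQLContext); the object name ExplodeMapColumn, the helper flatten, and the column names "user" and "units" are hypothetical:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.explode

object ExplodeMapColumn {
  // Explodes a map column into one row per entry and reads the rows back as triples.
  def flatten(spark: SparkSession): Array[(String, String, String)] = {
    import spark.implicits._ // enables toDF, the $"col" syntax, and tuple encoders

    val df = Seq(
      ("user1", Map("Gobelin" -> "2", "Archers" -> "3")),
      ("user2", Map("Giant" -> "1"))).toDF("user", "units")

    df.select($"user", explode($"units")) // a map explodes into "key" and "value" columns
      .as[(String, String, String)]       // tuple targets are matched by column position
      .collect()
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical local session for trying the sketch out.
    val spark = SparkSession.builder().appName("explode-map-column").master("local[*]").getOrCreate()
    try flatten(spark).foreach(println)
    finally spark.stop()
  }
}
```

Because explode on a map column emits two columns at once, the select already produces three columns and no extra un-nesting step is needed.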