I have a table that looks like the following:
ID  Type  5m  10m  15m
1   A     3   9    13
1   B     7   8    22
1   C     5   11   13
2   A     1   3    20
2   B     16  17   30
...
If possible, I would like to create new columns in the following format:
ID A_5m A_10m A_15m B_5m B_10m B_15m C_5m C_10m C_15m
I am currently following this Stack Overflow question: How to transpose/pivot the rows data to column in Spark Scala?
It's good for creating the new columns A and B, but I am lost when it comes to combining the types with the distances.
Any ideas?
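Answer 0 (score: 0)
"It's good for creating the new columns: A B, but I am lost when it comes to creating the types plus the distance."
It is no different. You can apply multiple aggregations in a single pivot: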
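import org.apache.spark.sql.functions.first
import spark.implicits._  // needed for toDF; assumes a SparkSession named spark, as in spark-shell

val df = Seq(
  (1, "A", 3, 9, 13), (1, "B", 7, 8, 22), (1, "C", 5, 11, 13),
  (2, "A", 1, 3, 20), (2, "B", 16, 17, 30)
).toDF("id", "type", "5m", "10m", "15m")

// One pivot on "type", with one aggregation per distance column
df.groupBy("id").pivot("type").agg(
  first("5m") as "5m", first("10m") as "10m", first("15m") as "15m"
).show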
+---+----+-----+-----+----+-----+-----+----+-----+-----+
| id|A_5m|A_10m|A_15m|B_5m|B_10m|B_15m|C_5m|C_10m|C_15m|
+---+----+-----+-----+----+-----+-----+----+-----+-----+
|  1|   3|    9|   13|   7|    8|   22|   5|   11|   13|
|  2|   1|    3|   20|  16|   17|   30|null| null| null|
+---+----+-----+-----+----+-----+-----+----+-----+-----+
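Spark generates the column names automatically from the level (the pivoted value, e.g. A) and the base name (the aggregation alias, e.g. 5m).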
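If there are many distance columns, listing every aggregation by hand gets tedious. The following is a minimal sketch of the same pivot with the aggregation list built from the DataFrame's columns. The object name PivotSketch and the helper pivotAll are hypothetical names introduced here for illustration, and the sketch assumes that every column other than the id and type columns is a value column and that there is at least one of them.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.first

object PivotSketch {
  // Hypothetical helper: pivots `typeCol` and keeps the first value of every
  // remaining column, so the distance columns are not hard-coded.
  // Assumes at least one column besides idCol and typeCol exists.
  def pivotAll(df: DataFrame, idCol: String, typeCol: String): DataFrame = {
    val valueCols = df.columns.filterNot(c => c == idCol || c == typeCol)
    // Alias each aggregation with its source column name ("5m", "10m", ...)
    val aggs = valueCols.map(c => first(c).as(c))
    df.groupBy(idCol).pivot(typeCol).agg(aggs.head, aggs.tail: _*)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("pivot-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(
      (1, "A", 3, 9, 13), (1, "B", 7, 8, 22), (1, "C", 5, 11, 13),
      (2, "A", 1, 3, 20), (2, "B", 16, 17, 30)
    ).toDF("id", "type", "5m", "10m", "15m")

    pivotAll(df, "id", "type").show()
    spark.stop()
  }
}
```

Note that the A_5m style of name relies on there being more than one aggregation. As far as I know, with a single aggregation Spark names each pivoted column after the pivot value alone (A, B, C), so a rename would be needed in that case.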