I'm currently working in Scala and wondering whether several columns can be merged into one. For example, given:
+------+--------+-------+----------+-----+
| User | family | phone | location | raz |
+------+--------+-------+----------+-----+
| u1 | f1 | p1 | l1 | r1 |
+------+--------+-------+----------+-----+
| u2 | f2 | p2 | l2 | r2 |
+------+--------+-------+----------+-----+
| u3 | f3 | p3 | l3 | r3 |
+------+--------+-------+----------+-----+
How can I combine phone, location, and raz into a single column, with each column's value on its own row?
+------+--------+-------+
| User | family | new |
+------+--------+-------+
| u1 | f1 | p1 |
+------+--------+-------+
| u1 | f1 | l1 |
+------+--------+-------+
| u1 | f1 | r1 |
+------+--------+-------+
| u2 | f2 | p2 |
+------+--------+-------+
| u2 | f2 | l2 |
+------+--------+-------+
| u2 | f2 | r2 |
+------+--------+-------+
| u3 | f3 | p3 |
+------+--------+-------+
| u3 | f3 | l3 |
+------+--------+-------+
| u3 | f3 | r3 |
+------+--------+-------+
Thanks.
Answer 0 (score: 0)
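One approach is to collect those columns into a single array column and then explode that column, which produces one row per array element: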
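// Assumes a SparkSession named `spark` is in scope (spark-shell provides one).
import org.apache.spark.sql.functions.{array, explode}
import spark.implicits._  // enables toDF and the $"col" syntax

val df = Seq(
  ("u1", "f1", "p1", "l1", "r1"),
  ("u2", "f2", "p2", "l2", "r2"),
  ("u3", "f3", "p3", "l3", "r3")
).toDF("User", "family", "phone", "location", "raz")

// Pack the three columns into one array column, then emit one row per element.
val df2 = df
  .withColumn("plr", array($"phone", $"location", $"raz"))
  .withColumn("new", explode($"plr"))
  .select("User", "family", "new")

df2.show()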
+----+------+---+
|User|family|new|
+----+------+---+
|  u1|    f1| p1|
|  u1|    f1| l1|
|  u1|    f1| r1|
|  u2|    f2| p2|
|  u2|    f2| l2|
|  u2|    f2| r2|
|  u3|    f3| p3|
|  u3|    f3| l3|
|  u3|    f3| r3|
+----+------+---+
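As a possible alternative to array plus explode, Spark SQL's built-in stack generator can unpivot the three columns in a single expression. The following is only a sketch under stated assumptions: it builds its own local SparkSession (the app name and master setting are illustrative, not from the question), and the name df3 is hypothetical. The target column is written in backticks so the parser treats new as a plain identifier.

```scala
import org.apache.spark.sql.SparkSession

// Illustrative local session; in spark-shell, getOrCreate() returns the existing one.
val spark = SparkSession.builder()
  .appName("stack-example")   // hypothetical name
  .master("local[*]")         // assumption: run locally
  .getOrCreate()
import spark.implicits._

val df = Seq(
  ("u1", "f1", "p1", "l1", "r1"),
  ("u2", "f2", "p2", "l2", "r2"),
  ("u3", "f3", "p3", "l3", "r3")
).toDF("User", "family", "phone", "location", "raz")

// stack(3, a, b, c) turns the three values into three rows of one column each.
val df3 = df.selectExpr(
  "User",
  "family",
  "stack(3, phone, location, raz) as `new`"
)

df3.show()
```

This should give the same nine rows as the array and explode version, without the intermediate plr column. Both approaches assume the three columns share a compatible type, which holds here because all of them are strings.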