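I have a Spark DataFrame that can contain duplicate columns holding different row values. Is it possible to merge these duplicate columns and get a DataFrame without any duplicate columns?
Example: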
| name|upload| name|upload1|
| null|  null|alice|    101|
| null|  null|  bob|    231|
|alice|   100| null|   null|
|  bob|    23| null|   null|
should become:
| name|upload|upload1|
|alice|  null|    101|
|  bob|  null|    231|
|alice|   100|   null|
|  bob|    23|   null|
Answer 0 (score: 0)
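import org.apache.spark.sql.functions.coalesce
import spark.implicits._ // assumes a SparkSession named `spark`, as in spark-shell

// Sample data; the second name column is called "name1" here so it can be referenced unambiguously.
val DF1 = Seq(
  (None, None, Some("alice"), Some(101)),
  (None, None, Some("bob"), Some(231)),
  (Some("alice"), Some(100), None, None),
  (Some("bob"), Some(23), None, None)
).toDF("name", "upload", "name1", "upload1")

// Take the first non-null of name / name1, then drop the now-redundant column.
DF1.withColumn("name", coalesce($"name", $"name1")).drop("name1").show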
+-----+------+-------+
| name|upload|upload1|
+-----+------+-------+
|alice|  null|    101|
|  bob|  null|    231|
|alice|   100|   null|
|  bob|    23|   null|
+-----+------+-------+
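
The snippet above avoids the ambiguity by naming the second column name1. If the DataFrame really carries two columns with the same name, as in the question, one possible approach is to rename every column by position first and then coalesce the columns that shared a name. The following is only a sketch: mergeDuplicateColumns and the _c0, _c1, ... aliases are hypothetical names introduced here, and it assumes that columns sharing a name have compatible types so that coalesce can combine them.

```scala
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.{coalesce, col}

// Hypothetical helper (not part of Spark): merges columns that share a name
// by keeping, for each row, the first non-null value among them.
def mergeDuplicateColumns(df: DataFrame): DataFrame = {
  val names = df.columns.toSeq

  // toDF renames by position, so duplicated names are never looked up by name.
  val aliases = names.indices.map(i => s"_c$i")
  val renamed = df.toDF(aliases: _*)

  // For each distinct original name (first-seen order), coalesce every
  // positional alias that carried that name and restore the original name.
  val merged: Seq[Column] = names.distinct.map { n =>
    val parts = names.zip(aliases).filter(_._1 == n).map(p => col(p._2))
    coalesce(parts: _*).as(n)
  }

  renamed.select(merged: _*)
}
```

Called on the question's DataFrame (columns name, upload, name, upload1), this is intended to return the columns name, upload, upload1, with name filled from whichever of the two original name columns is non-null in each row.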