How do I use pivot to generate a single-row matrix?

Asked: 2016-11-02 00:56:28

Tags: apache-spark apache-spark-sql

I need to convert the following two-column DataFrame into a single row (long to wide).

+--------+-----+
|   udate|   cc|
+--------+-----+
|20090622|  458|
|20090624|31068|
|20090626|  151|
|20090629|  148|
|20090914|  453|
+--------+-----+

I need it in this format:

+--------+--------+--------+--------+
|   udate|20090622|20090624|20090626| etc.
+--------+--------+--------+--------+
|      cc|     458|   31068|     151| etc.
+--------+--------+--------+--------+

I ran this:

result_df.groupBy($"udate").pivot("udate").agg(max($"cc")).show()

But the result is a matrix with one row per date as well as one column per date:

+--------+--------+--------+--------+--------+--------+---
|   udate|20090622|20090624|20090626|20090629|20090703|200
+--------+--------+--------+--------+--------+--------+---
|20090622|     458|    null|    null|    null|    null|   
|20090624|    null|   31068|    null|    null|    null|   
|20090626|    null|    null|     151|    null|    null|   
|20090629|    null|    null|    null|     148|    null|   
|20090703|    null|    null|    null|    null|     362|   
|20090704|    null|    null|    null|    null|    null|   
|20090715|    null|    null|    null|    null|    null|   
|20090718|    null|    null|    null|    null|    null|   
|20090721|    null|    null|    null|    null|    null|   
|20090722|    null|    null|    null|    null|    null|

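I expected that pivoting a single column of the dataset would produce a single-row pivoted dataset.

How can I modify the pivot command so that the result set is pivoted into a single row?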

1 answer:

Answer 0: (score: 2)

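tl;dr With Spark 2.4.0, it boils down to calling groupBy with no grouping columns. The whole DataFrame is then treated as a single group, so the pivot yields exactly one row.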

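// d is assumed to be the two-column DataFrame from the question (result_df there).
// Outside spark-shell, add: import org.apache.spark.sql.functions.first
val solution = d.groupBy().pivot("udate").agg(first("cc"))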
scala> solution.show
+--------+--------+--------+--------+--------+
|20090622|20090624|20090626|20090629|20090914|
+--------+--------+--------+--------+--------+
|     458|   31068|     151|     148|     453|
+--------+--------+--------+--------+--------+
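
For anyone who wants to try this outside spark-shell, here is a minimal self-contained sketch of the same empty-groupBy pivot. The object name, the helper `pivotToSingleRow`, the local master setting, and the choice of a string type for `udate` are all assumptions made for illustration; the question does not show the schema or how the DataFrame was built.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.first

// Hypothetical wrapper object, not part of the original answer.
object PivotToSingleRow {

  // groupBy() with no columns puts every input row into one group,
  // so the pivot on "udate" produces a single output row.
  def pivotToSingleRow(df: DataFrame): DataFrame =
    df.groupBy().pivot("udate").agg(first("cc"))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is good enough for a quick check.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("pivot-single-row")
      .getOrCreate()
    import spark.implicits._

    // Sample rows copied from the question; column types are assumed.
    val d = Seq(
      ("20090622", 458),
      ("20090624", 31068),
      ("20090626", 151),
      ("20090629", 148),
      ("20090914", 453)
    ).toDF("udate", "cc")

    pivotToSingleRow(d).show()

    spark.stop()
  }
}
```

The attempt in the question grouped by the same column it pivoted on, which gives one group (and so one row) per date. Dropping the grouping column is what collapses the result to a single row.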

If you really need the first column with the name, just prepend a literal column with select and lit.

val betterSolution = solution.select(lit("cc") as "udate", $"*")
scala> betterSolution.show
+-----+--------+--------+--------+--------+--------+
|udate|20090622|20090624|20090626|20090629|20090914|
+-----+--------+--------+--------+--------+--------+
|   cc|     458|   31068|     151|     148|     453|
+-----+--------+--------+--------+--------+--------+
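
A possible alternative, not taken from the answer above, is to add the label before pivoting and group by it. Since the label is constant there is still only one group, and the label column comes out first. The helper column name `label`, the object name, and the sample types are hypothetical choices for this sketch.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{first, lit}

// Hypothetical wrapper object, not part of the original answer.
object PivotWithLabelColumn {

  // Adds a constant column, groups by it (one distinct value => one row),
  // pivots on "udate", then renames the helper column to "udate" so the
  // header matches the layout requested in the question.
  def pivotWithLabel(df: DataFrame): DataFrame =
    df.withColumn("label", lit("cc")) // "label" is an assumed helper name
      .groupBy("label")
      .pivot("udate")
      .agg(first("cc"))
      .withColumnRenamed("label", "udate")

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is good enough for a quick check.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("pivot-with-label")
      .getOrCreate()
    import spark.implicits._

    // Sample rows copied from the question; column types are assumed.
    val d = Seq(
      ("20090622", 458),
      ("20090624", 31068),
      ("20090626", 151),
      ("20090629", 148),
      ("20090914", 453)
    ).toDF("udate", "cc")

    pivotWithLabel(d).show()

    spark.stop()
  }
}
```

The helper column needs a name other than `udate` while the pivot runs, because `udate` is still the pivot column at that point; the rename happens only afterwards.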