I would like to transpose a small DataFrame so that its columns become rows.
For example, suppose I have a DataFrame like this:
+---+---+------+
| id|obs|period|
+---+---+------+
|  1|230|  CURR|
|  2|456|  PREV|
+---+---+------+
I would like to have
+---------+-----+----+
|COL_NAME | CURR|PREV|
+---------+-----+----+
|id       |    1|   2|
|obs      |  230| 456|
+---------+-----+----+
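Any help would be appreciated. The closest I have come is this snippet I found online, which produces one row per (column, value) pair rather than the layout above: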
from pyspark.sql import functions as func

# Use `create_map` to build a map from constant keys to the column values
df = df.withColumn(
    'mapCol',
    func.create_map(
        func.lit('period'), df.period,
        func.lit('col_2'), df.id,
        func.lit('col_3'), df.obs,
    ),
)
# Use `explode` to turn each map entry into its own (key, value) row
res = df.select(func.explode(df.mapCol).alias('col_id', 'col_value'))
res.show()
+------+---------+
|col_id|col_value|
+------+---------+
|period|     CURR|
| col_2|        1|
| col_3|      230|
|period|     PREV|
| col_2|        2|
| col_3|      456|
+------+---------+
Answer 0 (score: 0)
Here is the answer I came up with. Thanks to everyone who tried to help.
# The query reads from a temporary view, so the DataFrame must be registered under the name df first
df.createOrReplaceTempView("df")
spark.sql("select 'ID' as COL_NAME, max(case when period = 'CURR' then id end) as CURR, \
max(case when period = 'PREV' then id end) as PREV from df union \
select 'OBS', max(case when period = 'CURR' then obs end), max(case when period = 'PREV' then obs end) from df")\
.show()
+--------+----+----+
|COL_NAME|CURR|PREV|
+--------+----+----+
|      ID|   1|   2|
|     OBS| 230| 456|
+--------+----+----+
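The query above hardcodes the period values and the two columns. If more columns or periods are involved, the same conditional-aggregation query could be generated instead of typed by hand. The following is only a sketch: `build_transpose_sql` and its parameter names are hypothetical helpers introduced for illustration, not part of PySpark, and the names passed in are pasted straight into the SQL text, so they are assumed to be trusted identifiers.

```python
def build_transpose_sql(view, pivot_col, pivot_values, value_cols):
    """Build one SELECT per value column and join them with UNION.

    Each SELECT yields a single row: a label for the column, followed by
    max(case when ...) picking that column's value for each pivot value.
    Hypothetical helper; names are interpolated as-is and must be trusted.
    """
    selects = []
    for i, col in enumerate(value_cols):
        cells = []
        for value in pivot_values:
            expr = f"max(case when {pivot_col} = '{value}' then {col} end)"
            # Only the first SELECT of a UNION needs column aliases.
            cells.append(f"{expr} as {value}" if i == 0 else expr)
        label = f"'{col.upper()}'" + (" as COL_NAME" if i == 0 else "")
        selects.append(f"select {label}, {', '.join(cells)} from {view}")
    return " union ".join(selects)


query = build_transpose_sql("df", "period", ["CURR", "PREV"], ["id", "obs"])
print(query)
# With a live SparkSession and a temp view named df, this could be run as:
# spark.sql(query).show()
```

For the example data this produces the same two SELECT statements as the handwritten query. Value columns of different types may need an explicit cast so that the UNION branches line up.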