Question

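My CSV file contains weather information for cities. Each row has a large number of columns (more than 1200), one group of columns per timestamp. For example, it looks like this: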

id  city_name  dt_0        temp_0  hum_0  dt_1        temp_1  hum_1  dt_2        temp_2  hum_2
1   Boston     2017110306  23.5    54.0   2017110310  21.4    40.0   2017110314  22.2    52.1
2   Seattle    2017110306  20.4    60.0   2017110310  18.4    42.0   2017110314  18.3    50.5

This layout does not work for me, so I want to convert it using a pandas DataFrame in Python. What I want is something like this:

id  city_name  dt          temp  hum
1   Boston     2017110306  23.5  54.0
1   Boston     2017110310  21.4  40.0
1   Boston     2017110314  22.2  52.1
2   Seattle    2017110306  20.4  60.0
2   Seattle    2017110310  18.4  42.0
2   Seattle    2017110314  18.3  50.5

How can I do this?

Answer 1

First call set_index, then build a MultiIndex on the columns with str.split, and finally reshape with stack:

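# Move the identifier columns into the index so they are left out of the reshape
df = df.set_index(['id', 'city_name'])
# Split names such as 'dt_0' into ('dt', '0'), giving a two-level column MultiIndex
df.columns = df.columns.str.split('_', expand=True)
# Stack the numeric suffix level into rows, drop it, then restore id and city_name as columns
df = df.stack().reset_index(level=2, drop=True).reset_index()
print(df)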
   id city_name          dt   hum  temp
0   1    Boston  2017110306  54.0  23.5
1   1    Boston  2017110310  40.0  21.4
2   1    Boston  2017110314  52.1  22.2
3   2   Seattle  2017110306  60.0  20.4
4   2   Seattle  2017110310  42.0  18.4
5   2   Seattle  2017110314  50.5  18.3
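
As an alternative sketch (not part of the answer above), pandas also provides pd.wide_to_long, which handles column names of the form stub_number directly. The example below rebuilds the sample data from the question. The names long_df and step are chosen here for illustration only, and it assumes every column other than id and city_name follows the dt_N / temp_N / hum_N pattern.

```python
import pandas as pd

# Sample data copied from the question (wide format: one row per city).
df = pd.DataFrame({
    'id': [1, 2],
    'city_name': ['Boston', 'Seattle'],
    'dt_0': [2017110306, 2017110306],
    'temp_0': [23.5, 20.4],
    'hum_0': [54.0, 60.0],
    'dt_1': [2017110310, 2017110310],
    'temp_1': [21.4, 18.4],
    'hum_1': [40.0, 42.0],
    'dt_2': [2017110314, 2017110314],
    'temp_2': [22.2, 18.3],
    'hum_2': [52.1, 50.5],
})

long_df = (
    pd.wide_to_long(
        df,
        stubnames=['dt', 'temp', 'hum'],  # column prefixes before the '_'
        i=['id', 'city_name'],            # columns that identify each wide row
        j='step',                         # hypothetical name for the numeric suffix
        sep='_',
    )
    .sort_index()           # order rows by id, city_name, then numeric step
    .reset_index()          # turn the index levels back into ordinary columns
    .drop(columns='step')   # the suffix is not needed in the final table
)
print(long_df)
```

Because the suffix is parsed as an integer, sorting on it keeps the time steps in numeric order (step 10 after step 9), which matters with more than 1200 columns.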
