I have a DataFrame like the one below:
Index Toronto_2000 Toronto_2001 Toronto_2002 Toronto_2003 Montreal_2000 Montreal_2001 Montreal_2002 Montreal_2003
ID:1012 100 98 102 105 101 104 108 110
How can I calculate the percentage change from one year to the next for each city?
Answer 0 (score: 1)
I suggest first reshaping the DataFrame with str.split and unstack, then using groupby with pct_change:
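# Split 'Toronto_2000' into a two-level column index: ('Toronto', '2000')
df.columns = df.columns.str.split('_', expand=True)
# Move both column levels into rows: one row per (city, year, index)
df = df.unstack().reset_index()
df.columns = ['city', 'year', 'index', 'val']
print(df)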
       city  year    index  val
0   Toronto  2000  ID:1012  100
1   Toronto  2001  ID:1012   98
2   Toronto  2002  ID:1012  102
3   Toronto  2003  ID:1012  105
4  Montreal  2000  ID:1012  101
5  Montreal  2001  ID:1012  104
6  Montreal  2002  ID:1012  108
7  Montreal  2003  ID:1012  110
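# Change relative to the previous row within each city; the first year of each city is NaN.
# Calling pct_change directly on the groupby keeps the result aligned with df's index.
df['pct'] = df.groupby('city')['val'].pct_change()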
print (df)
       city  year    index  val       pct
0   Toronto  2000  ID:1012  100       NaN
1   Toronto  2001  ID:1012   98 -0.020000
2   Toronto  2002  ID:1012  102  0.040816
3   Toronto  2003  ID:1012  105  0.029412
4  Montreal  2000  ID:1012  101       NaN
5  Montreal  2001  ID:1012  104  0.029703
6  Montreal  2002  ID:1012  108  0.038462
7  Montreal  2003  ID:1012  110  0.018519
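For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt from the sample in the question, on the assumption that "Index" is the DataFrame index rather than a regular column; the name long_df is only used here for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question.
# Assumption: 'Index' is the DataFrame index, not an ordinary column.
df = pd.DataFrame(
    {
        "Toronto_2000": [100], "Toronto_2001": [98],
        "Toronto_2002": [102], "Toronto_2003": [105],
        "Montreal_2000": [101], "Montreal_2001": [104],
        "Montreal_2002": [108], "Montreal_2003": [110],
    },
    index=pd.Index(["ID:1012"], name="Index"),
)

# 'Toronto_2000' -> ('Toronto', '2000') as a two-level column index
df.columns = df.columns.str.split("_", expand=True)

# Wide to long: one row per (city, year, index)
long_df = df.unstack().reset_index()
long_df.columns = ["city", "year", "index", "val"]

# Change relative to the previous year within each city
# (the columns are already in year order for each city)
long_df["pct"] = long_df.groupby("city")["val"].pct_change()
print(long_df)
```

The pct column holds fractions (for example -0.02 for a 2% drop); multiply by 100 if you want the values as percentages.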
Answer 1 (score: 0)
This is the long way around; maybe one of the pros will give you a nice one-liner.
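# Transpose so that each 'City_Year' column becomes a row
df2 = df.T.reset_index()
# Split 'Toronto_2000' into separate city and year columns
df2['city'] = df2['index'].str.split('_').str[0]
df2['year'] = df2['index'].str.split('_').str[1]
# Percentage change within each city, taken in year order
df2['pct'] = df2.sort_values('year').groupby('city')['ID:1012'].pct_change()
print(df2)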
Index          index  ID:1012      city  year       pct
0       Toronto_2000      100   Toronto  2000       NaN
1       Toronto_2001       98   Toronto  2001 -0.020000
2       Toronto_2002      102   Toronto  2002  0.040816
3       Toronto_2003      105   Toronto  2003  0.029412
4      Montreal_2000      101  Montreal  2000       NaN
5      Montreal_2001      104  Montreal  2001  0.029703
6      Montreal_2002      108  Montreal  2002  0.038462
7      Montreal_2003      110  Montreal  2003  0.018519
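One thing to watch if the real data has more than one row: grouping by city alone would compute changes across different IDs. Below is a rough sketch of one way to handle that with melt, grouping by both the ID and the city. The second row (ID:1013) and its numbers are made up purely for illustration and are not from the question; the column names key, val and pct are likewise my own choices.

```python
import pandas as pd

# Hypothetical data: the ID:1013 row is invented for illustration only.
df = pd.DataFrame(
    {
        "Toronto_2000": [100, 50], "Toronto_2001": [98, 55],
        "Montreal_2000": [101, 200], "Montreal_2001": [104, 190],
    },
    index=pd.Index(["ID:1012", "ID:1013"], name="Index"),
)

# Wide to long: one row per (Index, 'City_Year') pair
long_df = df.reset_index().melt(id_vars="Index", var_name="key", value_name="val")

# Split 'Toronto_2000' into city and year columns
long_df[["city", "year"]] = long_df["key"].str.split("_", expand=True)
long_df["year"] = long_df["year"].astype(int)

# Sort so consecutive rows within each (Index, city) group are consecutive years
long_df = long_df.sort_values(["Index", "city", "year"]).reset_index(drop=True)

# Year-over-year change per ID and city, expressed as a percentage
long_df["pct"] = long_df.groupby(["Index", "city"])["val"].pct_change() * 100
print(long_df)
```

Converting year to an integer makes the sort numeric, which matters only if the years are not all the same width; with four-digit years a string sort gives the same order.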