Pandas: How do I add yearly means to a DataFrame?

Asked: 2016-03-04 16:22:39

Tags: python pandas

year  x             y
1987  1.609438      0
1988  1.386294      0
1989  1.098612      1
1987  0.693147      0
1988  0.000000      0
1989 -0.693147      1

...

So I can get the mean of x and y for each year:

>>> df.groupby('year')[['x', 'y']].mean()
             x         y
year
1987  0.597434  0.000000
1988  0.428441  0.351852
1989  0.155169  0.185185

How do I add new columns that attach the mean for its year to each row? In other words, I want something like this:

year  x             y   meanX   meanY
1987  1.609438      0   0.597434  0.000000
1988  1.386294      0   0.428441  0.351852
1989  1.098612      1   0.155169  0.185185
1987  0.693147      0   0.597434  0.000000
1988  0.000000      0   0.428441  0.351852
1989 -0.693147      1   0.155169  0.185185

What is the right way to do this?

2 answers:

Answer 0: (score: 1)

# transform returns one value per original row, so the result can be assigned directly as a column
df['x_mean'] = df.groupby('year').x.transform(lambda s: s.mean())
df['y_mean'] = df.groupby('year').y.transform(lambda s: s.mean())

>>> df
   year         x  y    x_mean  y_mean
0  1987  1.609438  0  1.151293       0
1  1988  1.386294  0  0.693147       0
2  1989  1.098612  1  0.202733       1
3  1987  0.693147  0  1.151293       0
4  1988  0.000000  0  0.693147       0
5  1989 -0.693147  1  0.202733       1
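
As a self-contained sketch of the same idea, the snippet below rebuilds only the six sample rows shown in the question (the rest of the data is elided there, so the means will not match the question's expected output) and passes the built-in aggregation name 'mean' to transform instead of a lambda. The column names meanX and meanY are taken from the question's desired output.

```python
import pandas as pd

# Only the six rows visible in the question; the full dataset is not shown there.
df = pd.DataFrame({
    'year': [1987, 1988, 1989, 1987, 1988, 1989],
    'x': [1.609438, 1.386294, 1.098612, 0.693147, 0.000000, -0.693147],
    'y': [0, 0, 1, 0, 0, 1],
})

# transform('mean') computes the per-year mean and broadcasts it back
# to every row of that year, keeping the original index.
group_means = df.groupby('year')[['x', 'y']].transform('mean')

df['meanX'] = group_means['x']
df['meanY'] = group_means['y']

print(df)
```

Because the transformed frame keeps the original index, the assignment lines up row by row without any join.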

Answer 1: (score: 0)

pandas.DataFrame.merge should do what you want:

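import pandas as pd

data =  [
  {'year': 1987, 'x': 1.5116, 'y': 0},
  {'year': 1988, 'x': 1.135, 'y': 1}
]
df = pd.DataFrame(data)
# Per-year means, indexed by year
means = df.groupby('year')[['x', 'y']].mean()
# Match each row's year against the index of means; overlapping columns from means get the 'mean' suffix
df.merge(right=means, left_on='year', right_index=True, suffixes=('', 'mean'))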

This returns:

        x  y  year   xmean  ymean
0  1.5116  0  1987  1.5116      0
1  1.1350  1  1988  1.1350      1
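
A variant of the merge approach, sketched on the six sample rows visible in the question (the full dataset is elided there): renaming the aggregated columns before merging avoids relying on suffixes, and how='left' keeps the rows in their original order. The names meanX and meanY come from the question's desired output; the variable names are only illustrative.

```python
import pandas as pd

# Only the six rows visible in the question; the full dataset is not shown there.
df = pd.DataFrame({
    'year': [1987, 1988, 1989, 1987, 1988, 1989],
    'x': [1.609438, 1.386294, 1.098612, 0.693147, 0.000000, -0.693147],
    'y': [0, 0, 1, 0, 0, 1],
})

# One row per year, with the mean columns renamed up front
# and 'year' turned back into an ordinary column.
means = (
    df.groupby('year')[['x', 'y']]
      .mean()
      .rename(columns={'x': 'meanX', 'y': 'meanY'})
      .reset_index()
)

# A left merge keeps every row of df, in order, and attaches its year's means.
result = df.merge(means, on='year', how='left')

print(result)
```

With more than one row per year, each row of a given year receives the same meanX and meanY values, which is the layout the question asks for.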