Python pandas rolling mean without a fixed window size

Asked: 2017-03-01 15:43:31

Tags: python pandas dataframe mean

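I want to add two columns, [std_dev, mean], where the sample used for the mean and standard deviation expands as the dates progress within a given location.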

location   date              temp    std_dev    mean
NY         2014-02-01        60      
NY         2014-02-02        55      
NY         2014-02-03        70      
NY         2014-02-04        80      
LA         2014-02-01        80      
LA         2014-02-02        85      
LA         2014-02-03        75       

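I found a post explaining rolling mean/std that I could apply to the table. However, I get an error for std_dev because the number of rows per location is not a fixed value. How can I refer to the window size without fixing it?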

pandas rolling on a shifted dataframe

df['mean'] = df.groupby('location')['temp'].apply(pd.rolling_mean,4,min_periods=2).shift(1)

df['std_dev'] = df.groupby('location')['temp'].apply(pd.rolling_std,4,min_periods=2).shift(1)

Any help is much appreciated!

1 Answer:

Answer 0: (score: 2)

I think you are looking for expanding, for example:

>>> df
   temp location
0    60       NY
1    55       NY
2    70       NY
3    80       NY
4    80       LA
5    85       LA
6    75       LA

>>> expander = df.groupby('location').temp.expanding(min_periods=2)

>>> orderify = lambda x: x.reset_index(level=0, drop=True).sort_index()

>>> df['mean'], df['std'] = map(orderify, [expander.mean(), expander.std()])

>>> df
  location  temp       mean        std
0       NY    60        NaN        NaN
1       NY    55  57.500000   3.535534
2       NY    70  61.666667   7.637626
3       NY    80  66.250000  11.086779
4       LA    80        NaN        NaN
5       LA    85  82.500000   3.535534
6       LA    75  80.000000   5.000000
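
For readers who want to run this end to end, here is a minimal self-contained sketch of the same idea using `transform` instead of the `orderify` helper. It assumes the rows are already ordered by date within each location; the column name `std_dev` follows the question rather than the answer above.

```python
import pandas as pd

# Sample data from the question (dates omitted; rows are assumed to be
# in date order within each location).
df = pd.DataFrame({
    "location": ["NY", "NY", "NY", "NY", "LA", "LA", "LA"],
    "temp": [60, 55, 70, 80, 80, 85, 75],
})

# Group by location without reordering the groups.
grouped = df.groupby("location", sort=False)["temp"]

# An expanding window grows with each row, so no fixed size is needed.
# min_periods=2 leaves the first row of each group as NaN.
df["mean"] = grouped.transform(lambda s: s.expanding(min_periods=2).mean())
df["std_dev"] = grouped.transform(lambda s: s.expanding(min_periods=2).std())

print(df)
```

Because `transform` returns a result aligned to the original index, no re-sorting step is needed. If each row should only see earlier rows, as the `.shift(1)` in the question suggests, a per-group shift could be added inside the lambdas.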

Note: it would be nicer to use .agg on expander, but as of version 0.19.2, agg is not implemented for groupby.rolling or groupby.expanding, so it cannot be used here. See
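
The limitation above is specific to version 0.19.2. As a hedged sketch, assuming a more recent pandas release in which `agg` is available on grouped expanding windows (worth checking against your installed version), both statistics could be computed in one call:

```python
import pandas as pd

df = pd.DataFrame({
    "location": ["NY", "NY", "NY", "NY", "LA", "LA", "LA"],
    "temp": [60, 55, 70, 80, 80, 85, 75],
})

# Assumption: the installed pandas supports .agg on a grouped expanding window.
stats = (
    df.groupby("location", sort=False)["temp"]
      .expanding(min_periods=2)
      .agg(["mean", "std"])
)

# The result is indexed by (location, original row label); drop the group
# level so it aligns with df's index again.
stats = stats.reset_index(level=0, drop=True)

df = df.join(stats)
print(df)
```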