Rolling mean with Python 2.7

Time: 2016-03-31 03:39:02

Tags: python-2.7 pandas

I want to use pandas on Python 2.7 to compute a rolling mean of m_tax, in order to analyze time series data from this web page (http://owww.met.hu/eghajlat/eghajlati_adatsorok/bp/Navig/202_EN.htm).

   datum  m_ta m_tax     m_taxd m_tan     m_tand
------- ----- ----- ---------- ----- ----------
1901-01  -4.7   5.0 1901-01-23 -12.2 1901-01-10
1901-02  -2.1   3.5 1901-02-06  -7.9 1901-02-15
1901-03   5.8  13.5 1901-03-20   0.6 1901-03-01
1901-04  11.6  18.2 1901-04-10   7.4 1901-04-23
1901-05  16.8  22.5 1901-05-31  12.2 1901-05-05
1901-06  21.0  24.8 1901-06-03  14.6 1901-06-17
1901-07  22.4  27.4 1901-07-30  16.9 1901-07-04
1901-08  20.7  25.9 1901-08-01  14.7 1901-08-29
....

Here is the code I tried:

 pd.rolling_mean(df.resample("1M", fill_method="ffill"), window=60, min_periods=1, center=True).mean()

I got this result:

m_ta            11.029173
m_tax           17.104283
m_tan            4.848637
month            6.499500
monthly_mean    11.030405
monthly_std      1.836159
m_tax%           0.083348
m_tan%           0.023627
dtype: float64

I also tried another approach:

import numpy as np
import pandas as pd

s = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/1900', periods=1000))
s = s.cumsum()
r = s.rolling(window=60)
r.mean()

I got this result:

1900-01-01          NaN
1900-01-02          NaN
1900-01-03          NaN
1900-01-04          NaN
1900-01-05          NaN
1900-01-06          NaN
1900-01-07          NaN
1900-01-08          NaN
...

So I am confused. Which one should I use? Could someone give me some pointers? Thanks!

1 answer:

Answer 0: (score: 0)

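As of version 0.18.0, rolling() and resample() are both methods that behave similarly to groupby(), and the older function forms such as pd.rolling_mean() are deprecated.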

What's new in pandas version 0.18.0

rolling()/expanding() in pandas version 0.18.0

resample() in pandas version 0.18.0
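
As a rough illustration of the method-style API, here is a minimal sketch on synthetic data modeled on the second attempt in the question. The fixed seed and the names strict and relaxed are assumptions added for the example. It shows that .rolling() only builds a deferred object, and that the leading NaN values in the question come from the default min_periods, which equals the window size for an integer window.

```python
import numpy as np
import pandas as pd

# Synthetic daily series, as in the question (seed added for repeatability).
np.random.seed(0)
s = pd.Series(np.random.randn(1000),
              index=pd.date_range('1/1/1900', periods=1000)).cumsum()

# .rolling() returns a deferred Rolling object; nothing is computed yet.
r = s.rolling(window=60)

# Default min_periods == window, so the first 59 results are NaN.
strict = r.mean()

# min_periods=1 averages whatever is available, so no NaN remains.
relaxed = s.rolling(window=60, min_periods=1).mean()

print(strict.isnull().sum())   # 59
print(relaxed.isnull().sum())  # 0
```

So the NaN rows in the second attempt are expected rather than an error; passing min_periods=1, as the first attempt did, fills them in.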

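I can't tell exactly what result you are after, but perhaps something like this is what you want? (You can see the warning message below, although I'm not sure what triggers it.)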

>>> df

            m_ta  m_tax      m_taxd  m_tan      m_tand
datum                                                 
1901-01-01  -4.7    5.0  1901-01-23  -12.2  1901-01-10
1901-02-01  -2.1    3.5  1901-02-06   -7.9  1901-02-15
1901-03-01   5.8   13.5  1901-03-20    0.6  1901-03-01
1901-04-01  11.6   18.2  1901-04-10    7.4  1901-04-23
1901-05-01  16.8   22.5  1901-05-31   12.2  1901-05-05
1901-06-01  21.0   24.8  1901-06-03   14.6  1901-06-17
1901-07-01  22.4   27.4  1901-07-30   16.9  1901-07-04
1901-08-01  20.7   25.9  1901-08-01   14.7  1901-08-29

>>> df.resample("1M").rolling(3,center=True,min_periods=1).mean()

/Users/john/anaconda/lib/python3.5/site-packages/ipykernel/__main__.py:1: FutureWarning: .resample() is now a deferred operation
use .resample(...).mean() instead of .resample(...)
  if __name__ == '__main__':

                 m_ta      m_tax      m_tan
datum                                      
1901-01-31  -3.400000   4.250000 -10.050000
1901-02-28  -0.333333   7.333333  -6.500000
1901-03-31   5.100000  11.733333   0.033333
1901-04-30  11.400000  18.066667   6.733333
1901-05-31  16.466667  21.833333  11.400000
1901-06-30  20.066667  24.900000  14.566667
1901-07-31  21.366667  26.033333  15.400000
1901-08-31  21.550000  26.650000  15.800000
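
The FutureWarning above suggests calling .mean() on the resampler explicitly. A minimal sketch of that variant follows, assuming the eight sample rows from the question and a frame holding only the numeric columns. It uses the month-start alias "MS" rather than "1M", which is an assumption made here because the month-end alias has been renamed in newer pandas releases; the index labels are therefore month starts instead of month ends, while the values should be the same. The names monthly and smoothed are illustrative.

```python
import pandas as pd

# First eight rows of the question's table, numeric columns only.
idx = pd.date_range("1901-01-01", periods=8, freq="MS")
df = pd.DataFrame({
    "m_ta":  [-4.7, -2.1, 5.8, 11.6, 16.8, 21.0, 22.4, 20.7],
    "m_tax": [5.0, 3.5, 13.5, 18.2, 22.5, 24.8, 27.4, 25.9],
    "m_tan": [-12.2, -7.9, 0.6, 7.4, 12.2, 14.6, 16.9, 14.7],
}, index=idx, columns=["m_ta", "m_tax", "m_tan"])

# Aggregate explicitly, so the resampler is no longer left deferred.
monthly = df.resample("MS").mean()

# Centered 3-month rolling mean; min_periods=1 keeps the edge rows.
smoothed = monthly.rolling(window=3, center=True, min_periods=1).mean()

print(smoothed)
```

For the 60-month window in the question, the same pattern applies with window=60 on the m_tax column alone. Appending a further .mean() after the rolling mean, as in the question's first attempt, averages each column down to a single number, which is why that attempt printed one value per column.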