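I'm trying to learn Python's Pandas library, and I've run into the concept of a "rolling window", which is used in time series analysis. I was never a good statistics student, so I'm a bit lost.
Please explain the concept, preferably with a simple example and perhaps a code snippet.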
Answer 0 (score: 4)
Demo:
Setup:
In [11]: df = pd.DataFrame({'a':np.arange(10, 17)})
In [12]: df
Out[12]:
    a
0  10
1  11
2  12
3  13
4  14
5  15
6  16
Rolling sum with a window of 2 rows:
In [13]: df['a'].rolling(2).sum()
Out[13]:
0     NaN    # row 0 has no previous value, so the window holds fewer than 2 values: NaN
1    21.0    # sum of the current and previous value: 10 + 11
2    23.0    # sum of the current and previous value: 11 + 12
3    25.0    # ...
4    27.0
5    29.0
6    31.0
Name: a, dtype: float64
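To make it clearer what the window is doing, here is a rough sketch that recomputes the same 2-row rolling sum with a plain loop and compares it with pandas. The helper name rolling_sum_by_hand is made up for this illustration and is not part of pandas; it only mimics the default behaviour, where a window with fewer rows than the window size gives NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.arange(10, 17)})
window = 2


def rolling_sum_by_hand(values, window):
    """Hypothetical helper (not a pandas function): sum each value with
    the (window - 1) values before it, or NaN if there are too few."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            # Not enough rows yet to fill the window.
            out.append(float('nan'))
        else:
            # Slice covers the current row and the preceding window - 1 rows.
            out.append(float(sum(values[i - window + 1:i + 1])))
    return out


by_hand = rolling_sum_by_hand(df['a'].tolist(), window)
with_pandas = df['a'].rolling(window).sum().tolist()

print(by_hand)
print(with_pandas)
```

Apart from the leading NaN (which never compares equal to itself), the two lists should match element for element.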
Rolling sum with a window of 3 rows:
In [14]: df['a'].rolling(3).sum()
Out[14]:
0     NaN    # sum of current value and two preceding rows: NaN + NaN + 10
1     NaN    # sum of current value and two preceding rows: NaN + 10 + 11
2    33.0    # sum of current value and two preceding rows: 10 + 11 + 12
3    36.0    # ...
4    39.0
5    42.0
6    45.0
Name: a, dtype: float64
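The NaN values at the start come from the default setting that a window must be completely filled before a result is produced. As a small sketch on the same data (the variable names strict, partial and smoothed are just for illustration), the min_periods argument relaxes that requirement, and swapping sum() for mean() gives the moving average that is commonly used to smooth a time series:

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(10, 17), name='a')

# Default: min_periods equals the window size, so the first two rows are NaN.
strict = s.rolling(3).sum()

# min_periods=1: use however many rows are available so far.
partial = s.rolling(3, min_periods=1).sum()

# Same 3-row window, but averaging instead of summing (a moving average).
smoothed = s.rolling(3).mean()

print(pd.DataFrame({'strict': strict, 'partial': partial, 'mean': smoothed}))
```

With min_periods=1 the first two rows become 10.0 and 21.0 instead of NaN, because they are computed from the one or two rows that exist at that point.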