Can you access the pandas rolling window object?
rs = pd.Series(range(10))
rs.rolling(window = 3)
# prints
Rolling [window=3,center=False,axis=0]
Can I get the windows as groups?:
[0,1,2]
[1,2,3]
[2,3,4]
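Answer 0: (score: 4)
I'll start by saying that this reaches into the internal implementation. But if you really want to compute the indexers the same way pandas does, here is how.
You will need v0.19.0rc1 (just released), which you can install with conda install -c pandas pandas=0.19.0rc1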
In [41]: rs = pd.Series(range(10))
In [42]: rs
Out[42]:
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
dtype: int64
# this reaches into an internal implementation
# the first 3 is the window, the second 3 is the minimum number of
# periods required
In [43]: start, end, _, _, _, _ = pandas._window.get_window_indexer(rs.values,3,3,None,use_mock=False)
# starting index
In [44]: start
Out[44]: array([0, 0, 0, 1, 2, 3, 4, 5, 6, 7])
# ending index
In [45]: end
Out[45]: array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# window size
In [3]: end-start
Out[3]: array([1, 2, 3, 3, 3, 3, 3, 3, 3, 3])
# the indexers
In [47]: [np.arange(s, e) for s, e in zip(start, end)]
Out[47]:
[array([0]),
array([0, 1]),
array([0, 1, 2]),
array([1, 2, 3]),
array([2, 3, 4]),
array([3, 4, 5]),
array([4, 5, 6]),
array([5, 6, 7]),
array([6, 7, 8]),
array([7, 8, 9])]
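So this is trivial in the fixed-window case, but it becomes very useful in the variable-window scenario; in 0.19.0, for example, you can specify something like 2S to aggregate by time.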
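To illustrate the variable-window case, here is a minimal sketch using only the public API. The timestamps and values are made up for illustration, and the window is written as "2s" (the lowercase spelling that newer pandas releases prefer over 2S). Each window covers the 2 seconds up to and including the current row, so the number of rows per window varies.

```python
import pandas as pd

# Hypothetical data: five readings at irregular timestamps.
idx = pd.to_datetime([
    "2016-01-01 00:00:00",
    "2016-01-01 00:00:01",
    "2016-01-01 00:00:02",
    "2016-01-01 00:00:05",
    "2016-01-01 00:00:06",
])
ts = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=idx)

# Time-based window: rows whose timestamp falls in (t - 2s, t].
sums = ts.rolling("2s").sum()
counts = ts.rolling("2s").count()

print(sums.tolist())    # [1.0, 3.0, 5.0, 4.0, 9.0]
print(counts.tolist())  # [1.0, 2.0, 2.0, 1.0, 2.0]
```

The counts show the window size changing with the gaps in the index, which is why a fixed integer offset is not enough here.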
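All that said, getting these indexers is not particularly useful. You generally want to do something with the results. That is the point of the aggregation functions, or .apply if you want a general aggregation.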
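As a rough sketch of the public-API route, .apply runs an arbitrary function over each full window. Separately, and not part of the original answer: pandas 1.1 and later also let you iterate over a Rolling object directly, which yields the groups the question asks for. The lambda and the variable names below are illustrative only.

```python
import pandas as pd

rs = pd.Series(range(10))

# General aggregation: the function receives each full window's values.
# raw=True passes a NumPy array instead of a Series.
spread = rs.rolling(window=3).apply(lambda w: w.max() - w.min(), raw=True)

# pandas >= 1.1: iterating a Rolling object yields one Series per window.
# The leading windows are shorter than the window size, so filter them out.
windows = [w.tolist() for w in rs.rolling(window=3)]
full_windows = [w for w in windows if len(w) == 3]

print(full_windows[:3])  # [[0, 1, 2], [1, 2, 3], [2, 3, 4]]
```

Iteration is convenient for inspection, but for actual computation the built-in aggregations or .apply are usually the better fit.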
Answer 1: (score: 2)
Here is a workaround, but I'm waiting to see if anyone has a pandas solution:
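import numpy as np

def rolling_window(a, step):
    # Build a 2-D view of the 1-D array `a`; each row is a window of length `step`.
    shape = a.shape[:-1] + (a.shape[-1] - step + 1, step)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
# pass the underlying NumPy array, since the function relies on .strides
rolling_window(rs.values, 3)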
array([[ 0, 1, 2],
[ 1, 2, 3],
[ 2, 3, 4],
[ 3, 4, 5],
[ 4, 5, 6],
[ 5, 6, 7],
[ 6, 7, 8],
[ 7, 8, 9]])
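If NumPy 1.20 or later is available, the same windowed view can probably be obtained without computing strides by hand. This is a sketch of an alternative to as_strided, not the answer's own method; it uses np.lib.stride_tricks.sliding_window_view, which returns a read-only view by default and so avoids the accidental out-of-bounds access that a wrong shape or stride can cause with as_strided.

```python
import numpy as np
import pandas as pd

rs = pd.Series(range(10))

# Each row of the result is one length-3 window over the Series values.
windows = np.lib.stride_tricks.sliding_window_view(rs.to_numpy(), window_shape=3)

print(windows.shape)  # (8, 3)
print(windows[0])     # [0 1 2]
print(windows[-1])    # [7 8 9]
```

A Series of length 10 with a window of 3 gives 10 - 3 + 1 = 8 full windows.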