Pandas: can you access the rolling window items?

Time: 2016-09-07 15:31:51

Tags: python pandas

Can you access the items of a pandas rolling window object?

import pandas as pd

rs = pd.Series(range(10))
rs.rolling(window=3)

# prints
Rolling [window=3,center=False,axis=0]

Can I get the windows as groups, like this?

[0,1,2]
[1,2,3]
[2,3,4]

2 answers:

Answer 0: (score: 4)

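I'll start by saying that this reaches into the internal implementation. But if you really want to compute the indexers the same way pandas does, here is how.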

You need v0.19.0rc1 (just released), which you can install with conda install -c pandas pandas=0.19.0rc1

In [41]: rs = pd.Series(range(10))

In [42]: rs
Out[42]: 
0    0
1    1
2    2
3    3
4    4
5    5
6    6
7    7
8    8
9    9
dtype: int64

# this reaches into an internal implementation
# the first 3 is the window size, the second 3 is the minimum
# number of periods required
In [43]: start, end, _, _, _, _ = pandas._window.get_window_indexer(rs.values,3,3,None,use_mock=False)

# starting index
In [44]: start
Out[44]: array([0, 0, 0, 1, 2, 3, 4, 5, 6, 7])

# ending index
In [45]: end
Out[45]: array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

# window size
In [3]: end-start
Out[3]: array([1, 2, 3, 3, 3, 3, 3, 3, 3, 3])

# the indexers
In [47]: [np.arange(s, e) for s, e in zip(start, end)]
Out[47]: 
[array([0]),
 array([0, 1]),
 array([0, 1, 2]),
 array([1, 2, 3]),
 array([2, 3, 4]),
 array([3, 4, 5]),
 array([4, 5, 6]),
 array([5, 6, 7]),
 array([6, 7, 8]),
 array([7, 8, 9])]
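
For the fixed-window case, the start and end arrays shown above can be reproduced with plain NumPy, without touching pandas internals. This is only a sketch: fixed_window_bounds is a hypothetical helper name, not a pandas function, and it assumes a trailing (non-centered) window.

```python
import numpy as np

def fixed_window_bounds(n, window):
    """Hypothetical helper: start/end bounds of a trailing fixed window."""
    # The window for position i ends just after i (exclusive end = i + 1) ...
    end = np.arange(1, n + 1)
    # ... and starts `window` items earlier, clipped at the start of the data.
    start = np.maximum(end - window, 0)
    return start, end

start, end = fixed_window_bounds(10, 3)

# Same construction as in the session above: one index array per window.
windows = [np.arange(s, e) for s, e in zip(start, end)]
```

The first two windows are shorter than 3 because there is not enough data yet, which matches the window sizes printed above.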

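So this is fairly trivial in the fixed-window case. It becomes much more useful in the variable-window scenario: for example, in 0.19.0 you can specify something like 2S to aggregate by time.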
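
A minimal sketch of such a time-based window, assuming pandas 0.19.0 or later and a made-up series with irregular timestamps (the data is purely illustrative). The window is written as "2s" here, since newer pandas releases prefer the lowercase alias.

```python
import pandas as pd

# Illustrative data: five observations at irregular one-second offsets.
idx = pd.to_datetime([
    "2016-09-07 15:31:50",
    "2016-09-07 15:31:51",
    "2016-09-07 15:31:53",
    "2016-09-07 15:31:54",
    "2016-09-07 15:31:57",
])
ts = pd.Series([1, 2, 3, 4, 5], index=idx)

# Each window covers the 2 seconds up to and including the current timestamp,
# so the number of rows per window varies with the spacing of the index.
summed = ts.rolling("2s").sum()
```

With this data the windows hold 1, 2, 1, 2 and 1 rows respectively, so the number of items per window is no longer fixed.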

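All that said, getting these indexers is not particularly useful on its own. You generally want to compute something from the windows. That is the point of the aggregation functions, or of .apply if you want to aggregate with an arbitrary function.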
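
As a rough illustration of the .apply route, the sketch below passes each full window to a custom function. value_range is a hypothetical example function, and the raw=True argument assumes a pandas version that supports it (0.23 or later).

```python
import pandas as pd

rs = pd.Series(range(10))

def value_range(window):
    # With raw=True, `window` arrives as a NumPy array of the window's values.
    return window.max() - window.min()

# Positions without a full window of 3 values yield NaN.
spread = rs.rolling(window=3).apply(value_range, raw=True)
```

The function only ever sees one window at a time, which is usually all that is needed; the windows themselves are never materialized as a list.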

Answer 1: (score: 2)

Here is a workaround, but I'm waiting to see whether anyone has a pandas solution:

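import numpy as np

def rolling_window(a, step):
    # step is the window length; the last axis of `a` is split into overlapping windows
    shape   = a.shape[:-1] + (a.shape[-1] - step + 1, step)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

# pass the underlying NumPy array; a Series has no .strides attribute in recent pandas
rolling_window(rs.values, 3)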

array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4],
       [3, 4, 5],
       [4, 5, 6],
       [5, 6, 7],
       [6, 7, 8],
       [7, 8, 9]])
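
For readers on newer library versions, here are two alternatives that avoid hand-computed strides. This is a sketch under version assumptions: sliding_window_view requires NumPy 1.20 or later, and iterating over a Rolling object requires pandas 1.1 or later. Neither was available when this answer was written.

```python
import numpy as np
import pandas as pd

rs = pd.Series(range(10))

# NumPy >= 1.20: a read-only view with one row per full window.
view = np.lib.stride_tricks.sliding_window_view(rs.to_numpy(), 3)

# pandas >= 1.1: a Rolling object can be iterated, yielding one Series per
# position. The leading windows are shorter than 3, so they are filtered out.
full = [w.tolist() for w in rs.rolling(window=3) if len(w) == 3]
```

Both give the eight full windows from [0, 1, 2] to [7, 8, 9]. The NumPy version returns a 2-D array, while the pandas version keeps the original index on each window before tolist() is called.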