Pandas rolling window - flagging values

Time: 2018-04-18 17:42:16

Tags: python pandas

In a pandas DataFrame, I want to filter the rows where a column stays stable within 10.0 units.

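def abs_delta_fn(window):
    # window is a NumPy array because apply() is called with raw=True
    x = window[0]
    for y in window[1:]:
        if abs(x - y) > 10.0:
            return False
    # only reached when every value is within 10.0 of the first one
    return True

# raw=True (pandas >= 0.23) passes each window as a NumPy array
df['filter'] = df['column'].rolling(5, min_periods=5).apply(abs_delta_fn, raw=True)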

So, given a df like this

1   0
2   20
3   40
4   40
5   40
6   40
7   40
8   90
9   120
10  120

applying the rolling window gives me:

1   0     nan
2   20    nan
3   40    nan
4   40    nan
5   40    False
6   40    False
7   40    True
8   90    False
9   120   False
10  120   False

How can I get this instead, in a smart way?

1  0     nan (or False)
2  20    nan (or False)
3  40    True
4  40    True
5  40    True
6  40    True
7  40    True
8  90    False
9  120   False
10 120   False

1 answer:

Answer 0: (score: 1)

IIUC, you already know rolling and apply. The key here is .iloc[::-1]: rolling looks backward from the current row, but you need to look forward, so reverse the Series first. The second line then drops every row that belongs to a stable window.

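# reversed rolling: s is 1.0 at a row when it and the next 4 rows form a stable window
s = df.x.iloc[::-1].rolling(5, min_periods=5).apply(lambda x: (abs(x - x[0]) < 10).all(), raw=True)
# drop every row covered by a stable window (assumes a consecutive integer index)
df.loc[df.index.difference(sum([list(range(x, x + 5)) for x in s[s == 1].index.values], []))]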

Out[1119]: 
      x
1     0
2    20
8    90
9   120
10  120
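
If you want the True/False column from the question rather than the filtered frame, here is a minimal sketch of one way to do it. It is not part of the answer above. It assumes the column is named x as in the answer, and the names WINDOW, TOL, is_stable and stable_end are made up for illustration. The idea: first mark each window by its last row (what rolling does by default), then spread that mark back over the 5 rows of the window with a reversed rolling max.

```python
import numpy as np
import pandas as pd

# Sample data from the question; the column name "x" follows the answer above.
df = pd.DataFrame(
    {"x": [0, 20, 40, 40, 40, 40, 40, 90, 120, 120]},
    index=range(1, 11),
)

WINDOW = 5   # hypothetical name: rows per window
TOL = 10.0   # hypothetical name: allowed distance from the window's first value


def is_stable(window):
    # window is a NumPy array because raw=True is passed below
    return float((np.abs(window - window[0]) <= TOL).all())


# 1.0 at a row when the window ENDING at that row is stable, NaN for the first 4 rows
stable_end = df["x"].rolling(WINDOW, min_periods=WINDOW).apply(is_stable, raw=True)

# A row belongs to a stable window if any window ending at that row or at one of
# the next 4 rows is stable. Reversing turns that forward look into a plain rolling max.
df["filter"] = (
    stable_end.fillna(0)
    .iloc[::-1]
    .rolling(WINDOW, min_periods=1)
    .max()
    .iloc[::-1]
    .astype(bool)
)

print(df)
```

On the sample data this flags rows 3 to 7 as True and the rest as False. The first rows get False instead of nan because of the fillna(0).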