In a pandas DataFrame, I want to filter the rows where a column stays stable within 10.0 units.
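def abs_delta_fn(window):
    # True if every value in the window is within 10.0 of the first one
    x = window[0]
    for y in window[1:]:
        if abs(x - y) > 10.0:
            return False
    return True

# raw=True passes each window as a NumPy array, so window[0] is positional
df['filter'] = df['column'].rolling(5, min_periods=5).apply(abs_delta_fn, raw=True)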
So, if I have a df like this:
1 0
2 20
3 40
4 40
5 40
6 40
7 40
8 90
9 120
10 120
applying the rolling window gives me:
1 0 nan
2 20 nan
3 40 nan
4 40 nan
5 40 False
6 40 False
7 40 True
8 90 False
9 120 False
10 120 False
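The flag lands on the last row of each window because an integer rolling window is trailing: it covers the current row and the four rows before it. Below is a small sketch, assuming the same ten values with labels 1 to 10 (the names s and windows are only illustrative), that lists the rows each complete window covers:

```python
import pandas as pd

# Same ten values as above, labelled 1..10 (assumed to match the example frame).
s = pd.Series([0, 20, 40, 40, 40, 40, 40, 90, 120, 120], index=range(1, 11))

# For each row that has a full 5-row window, collect the values the window sees:
# the current row plus the 4 rows before it.
windows = {
    int(s.index[i]): s.iloc[i - 4 : i + 1].tolist()
    for i in range(4, len(s))
}

for label, values in windows.items():
    print(label, values)
```

The windows ending at rows 5 and 6 still contain the early values 0 and 20, which is why those rows come out False even though the values from row 3 onward are stable.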
How can I get this in a smart way?
1 0 nan (or False)
2 20 nan (or False)
3 40 True
4 40 True
5 40 True
6 40 True
7 40 True
8 90 False
9 120 False
10 120 False
Answer 0 (score: 1)
IIUC, you already know rolling; you just need to add apply after it. The key here is .iloc[::-1], because rolling looks backward from the current row, but you need it to look forward.
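# Reverse the rows so each 5-row window looks forward; raw=True makes x[0] positional
s = df.x.iloc[::-1].rolling(5, min_periods=5).apply(lambda x: (abs(x - x[0]) < 10).all(), raw=True)
# Drop every row covered by a stable window (assumes a consecutive integer index)
df.loc[df.index.difference(sum([list(range(i, i + 5)) for i in s[s == 1].index.values], []))]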
Out[1119]:
x
1 0
2 20
8 90
9 120
10 120
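The output above keeps the rows that are not stable. If the goal is the boolean column from the question instead, one possible sketch is to run the trailing check first and then spread each passing window's flag back over the rows it covers with a reversed rolling max. The helper name flag_stable_rows and its parameters are hypothetical, and the column name column is taken from the question rather than from this answer:

```python
import numpy as np
import pandas as pd

def flag_stable_rows(s, window=5, tol=10.0):
    """Flag rows that belong to at least one run of `window` consecutive
    rows whose values all lie within `tol` of the first value in the run."""
    # 1.0 where the window ENDING at this row passes the check, else 0.0.
    ends = s.rolling(window, min_periods=window).apply(
        lambda w: float((np.abs(w - w[0]) <= tol).all()), raw=True
    )
    ends = ends.fillna(0.0)  # incomplete windows count as "not stable"
    # A row is covered if any window ending at it or at one of the next
    # window-1 rows passed; rolling over the reversed series looks forward.
    covered = ends.iloc[::-1].rolling(window, min_periods=1).max().iloc[::-1]
    return covered.astype(bool)

df = pd.DataFrame(
    {"column": [0, 20, 40, 40, 40, 40, 40, 90, 120, 120]},
    index=range(1, 11),
)
df["filter"] = flag_stable_rows(df["column"])
print(df)
```

On the example data this flags rows 3 through 7 and nothing else. Unlike the range-based lookup, it works by position, so it does not rely on the index being consecutive integers.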