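I have an SPI time series of length 324 with values ranging from -3 to +3. I want to get the indices of the places where 3 or more consecutive time steps are below the threshold of -1.
I have searched this site and elsewhere thoroughly without much success. For example, "Check if there are 3 consecutive values in an array which are above some threshold" does not quite do what I need.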
That is what I am trying to do. Any help would be appreciated, thanks.
Answer 0: (score: 2)
Here is a step-by-step recipe with comments.
import numpy as np

a = [-3, 4, 5, -1, -2, -5, 1, 4, 6, 9, -3, -3, -1, -2, 4, 1, 4]
th = -1
a = np.array(a)
# create mask of events (values at or below th); find indices where mask switches
intervals = np.where(np.diff(a <= th, prepend=0, append=0))[0].reshape(-1, 2)
# each row is a half-open [start, stop) interval; discard stretches shorter than 3
intervals = intervals[np.subtract(*intervals.T) <= -3]
intervals
# array([[ 3,  6],
#        [10, 14]])
# get corresponding data
stretches = np.split(a, intervals.reshape(-1))[1::2]
stretches
# [array([-1, -2, -5]), array([-3, -3, -1, -2])]
# count events (length of each stretch)
-np.subtract(*intervals.T)
# array([3, 4])
# sum events (sum of the values in each stretch)
np.add.reduceat(a, intervals.reshape(-1))[::2]
# array([-8, -9])
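Since the question asks for the indices themselves, the `[start, stop)` intervals can be expanded into index arrays. The following is only a minimal sketch of that idea. The helper name `run_indices` and its parameters `threshold` and `min_len` are hypothetical, chosen for illustration, and "below the threshold" is assumed to mean `<=` as in the recipe above.

```python
import numpy as np

def run_indices(values, threshold=-1, min_len=3):
    """Return one index array per run of >= min_len consecutive values <= threshold."""
    mask = np.asarray(values) <= threshold
    # Pad with False on both sides so runs touching either end are closed.
    padded = np.concatenate(([False], mask, [False]))
    # Positions where the mask flips; each row is a half-open [start, stop) pair.
    edges = np.flatnonzero(padded[1:] != padded[:-1]).reshape(-1, 2)
    # Keep only the runs that are long enough.
    edges = edges[edges[:, 1] - edges[:, 0] >= min_len]
    return [np.arange(start, stop) for start, stop in edges]

a = [-3, 4, 5, -1, -2, -5, 1, 4, 6, 9, -3, -3, -1, -2, 4, 1, 4]
runs = run_indices(a)
# runs holds two arrays: indices 3..5 and indices 10..13
```

If a single flat index array is preferred, `np.concatenate(runs)` should work as long as at least one run was found.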
Answer 1: (score: 2)
Since you tagged pandas:
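import pandas as pd

s = pd.Series([-3, 4, 5, -1, -2, -5, 1, 4, 6, 9, -3, -3, -1, -2, 4, 1, 4])
# thresholding: True where the value is at or below -1
a = (s <= -1)
# blocks: the label increases every time the mask changes value
b = (a != a.shift()).cumsum()
# groupby: aggregate only the values that are below the threshold
df = s[a].groupby(b).agg([list, 'size', 'sum'])
# keep runs of length 3 or more (df.size is a DataFrame attribute, so index the column)
df = df[df['size'] >= 3]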
Output:
               list  size  sum
3      [-1, -2, -5]     3   -8
5  [-3, -3, -1, -2]     4   -9
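The table above gives the values of each run. If what is needed is the index labels, as the question asks, the same block labels can be reused with `transform('size')`. This is a rough sketch under the assumption that "below the threshold" means `<= -1`. The variable names `below`, `run_id`, `run_len` and `idx` are made up for this example.

```python
import pandas as pd

s = pd.Series([-3, 4, 5, -1, -2, -5, 1, 4, 6, 9, -3, -3, -1, -2, 4, 1, 4])

# True where the value is at or below the threshold
below = s <= -1
# Label each run of identical mask values with its own integer
run_id = (below != below.shift()).cumsum()
# Broadcast the length of each run back onto every element of that run
run_len = below.groupby(run_id).transform('size')
# Index labels that sit inside a below-threshold run of length 3 or more
idx = s.index[below & (run_len >= 3)]
```

For the sample data, `list(idx)` is `[3, 4, 5, 10, 11, 12, 13]`.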
Answer 2: (score: 1)
Use np.logical_and.reduce + shift to check whether consecutive rows are below the threshold. Then use groupby to get all the aggregations you need:
import numpy as np
import pandas as pd

def get_grps(s, thresh=-1, Nmin=3):
    """
    Nmin : int > 0
        Min number of consecutive values below threshold.
    """
    # True at positions where this value and the next Nmin-1 values are all <= thresh
    m = np.logical_and.reduce([s.shift(-i).le(thresh) for i in range(Nmin)])
    if Nmin > 1:
        # Extend each True forward so it covers the whole run, not just its start
        m = (pd.Series(m, index=s.index).replace({False: np.nan})
               .ffill(limit=Nmin-1).fillna(False).astype(bool))
    else:
        m = pd.Series(m, index=s.index)

    # Form consecutive groups
    gps = m.ne(m.shift(1)).cumsum().where(m)

    # Return None if no groups, else the aggregations
    if gps.isnull().all():
        return None
    else:
        return s.groupby(gps).agg([list, 'sum', 'size']).reset_index(drop=True)

# same sample data as in the first answer
a = [-3, 4, 5, -1, -2, -5, 1, 4, 6, 9, -3, -3, -1, -2, 4, 1, 4]

get_grps(pd.Series(a))
#                list  sum  size
# 0      [-1, -2, -5]   -8     3
# 1  [-3, -3, -1, -2]   -9     4

get_grps(pd.Series(a), thresh=-1, Nmin=1)
#                list  sum  size
# 0              [-3]   -3     1
# 1      [-1, -2, -5]   -8     3
# 2  [-3, -3, -1, -2]   -9     4

get_grps(pd.Series(a), thresh=-100, Nmin=1)
# None