I have two large lists, t and y, and I want to efficiently determine when, and for how long, the data in y exceeds a predefined limit, i.e. y >= limit.
The problem can be illustrated with the following sample data:
t = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
y = [8,6,4,2,0,2,4,6,8,6,4,2,0,2,4,6,8]
limit = 4
For this example, the code should return the following lists:
t_exceedance_start = [0,6,14]
t_how_long_above_limit = [2,4,2]
I expected NumPy to handle this well, but I have not found a way to do it.
Any suggestions would be much appreciated.
Answer 0 (score: 1)
Here is a vectorized approach that takes advantage of fast boolean array operations:
import numpy as np

# Convert to arrays if they aren't already
y = np.asarray(y)
t = np.asarray(t)
# Get the mask of thresholded y, padded with False on both sides.
# The intention is to use a one-off shifted comparison to catch the
# boundaries of each island of thresholded True values (done in the next step).
# The padded False values act as triggers to catch the start of the
# first island and the end of the last island.
mask = np.concatenate(( [False], y>=limit, [False] ))
idx = np.flatnonzero(mask[1:] != mask[:-1])
# The starting indices of the islands are the entries of idx at steps of 2.
# The ending indices are also at steps of 2, starting from the second entry;
# subtracting 1 gives the last index that is still inside each island.
# Differencing ends and starts gives each island's index span (sample count minus one).
starts = idx[::2]
ends = idx[1::2] - 1
lens = ends - starts
# Get starts, ends, lengths according to t times
start_times = t[starts]
end_times = t[ends]
len_times = end_times - start_times
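For convenience, the steps above can be wrapped in a single function and checked against the sample data from the question. This is only a minimal sketch: the function name exceedance_intervals is made up for illustration and is not part of NumPy, and it assumes t and y have the same length.

```python
import numpy as np

def exceedance_intervals(t, y, limit):
    """Return (start_times, durations) for each run where y >= limit.

    Hypothetical helper name, used here for illustration only.
    """
    t = np.asarray(t)
    y = np.asarray(y)
    # Pad with False so runs touching either end of the data are detected.
    mask = np.concatenate(([False], y >= limit, [False]))
    # Positions where the padded mask flips; they alternate start, end, start, ...
    idx = np.flatnonzero(mask[1:] != mask[:-1])
    starts = idx[::2]      # index of the first sample of each run
    ends = idx[1::2] - 1   # index of the last sample of each run
    return t[starts], t[ends] - t[starts]

t = list(range(17))
y = [8, 6, 4, 2, 0, 2, 4, 6, 8, 6, 4, 2, 0, 2, 4, 6, 8]
start_times, durations = exceedance_intervals(t, y, limit=4)
print(start_times.tolist())  # [0, 6, 14]
print(durations.tolist())    # [2, 4, 2]
```

Note that the duration is measured from the first to the last sample at or above the limit, so a run consisting of a single sample has a duration of 0. If y never reaches the limit, both returned arrays are empty.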