I need to check whether the array realtime contains any time samples that are not increasing (time going backwards).
realtime
Out[2]:
array([datetime.datetime(2017, 11, 3, 20, 25, 10, 724000),
datetime.datetime(2017, 11, 3, 20, 25, 10, 744000),
datetime.datetime(2017, 11, 3, 20, 25, 10, 764000), ...,
datetime.datetime(2017, 11, 4, 2, 13, 44, 704000),
datetime.datetime(2017, 11, 4, 2, 13, 44, 724000),
datetime.datetime(2017, 11, 4, 2, 13, 44, 744000)], dtype=object)
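realtime is 1045702 elements long!
I tried:
import numpy as np
import pandas as pd

d = pd.DataFrame(np.zeros((len(realtime), 1)))
for i in range(len(realtime)):
    if any(realtime[i] <= x for x in realtime[:i]):  # smaller/equal than any prior
        d.iloc[i] = True
but it takes forever... Is there a faster way to check whether the elements of the array are increasing, and to flag the ones that are not?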
Answer 0 (score: 4)
You can use numpy.diff to get an array of timedeltas and compare it with 0:
b = np.diff(realtime) > datetime.timedelta(0)
print (b)
[ True True True True True]
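To go from the boolean result to the flagged samples the question asks for, the comparison can be inverted and shifted by one position. The following is only a sketch: the function name flag_backward_steps and the small sample array are made up for illustration, and it assumes every element is a naive datetime.datetime so that the array can be converted to datetime64[us] first.

```python
import datetime
import numpy as np

def flag_backward_steps(times):
    """Return a boolean array that is True at i when times[i] <= times[i - 1]."""
    # Assumption: all elements are naive datetime.datetime objects.
    t = np.asarray(times, dtype="datetime64[us]")
    flags = np.zeros(t.shape, dtype=bool)
    # The first sample has no predecessor, so it is never flagged.
    flags[1:] = np.diff(t) <= np.timedelta64(0, "us")
    return flags

# Hypothetical data: the third sample jumps back to the start time.
start = datetime.datetime(2017, 11, 3, 20, 25, 10, 724000)
step = datetime.timedelta(milliseconds=20)
sample = np.array([start, start + step, start, start + 2 * step], dtype=object)

bad = flag_backward_steps(sample)
print(np.flatnonzero(bad))  # [2]
```

np.flatnonzero gives the positions of the offending samples, and sample[bad] gives the samples themselves.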
In pandas, you can convert to a pd.Series object and use diff:
b = pd.Series(realtime).diff()
# the first diff is NaT (no predecessor); set it to 1 so it compares as positive
b.iat[0] = 1
print (b > pd.Timedelta(0))
0 True
1 True
2 True
3 True
4 True
5 True
dtype: bool
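If the goal is to mark the bad samples rather than the good ones, the comparison can be flipped. In that direction the first row needs no special handling, because a comparison against NaT evaluates to False. This is a sketch under the same assumption as before (naive datetime.datetime elements); the name flag_backward_steps_pd and the sample values are hypothetical.

```python
import datetime
import pandas as pd

def flag_backward_steps_pd(times):
    """Return a boolean Series that is True where a sample is <= its predecessor."""
    s = pd.Series(pd.to_datetime(times))
    # diff() yields NaT for the first row; NaT <= Timedelta(0) is False,
    # so the first sample is left unflagged.
    return s.diff() <= pd.Timedelta(0)

# Hypothetical data: the third sample jumps back to the start time.
start = datetime.datetime(2017, 11, 3, 20, 25, 10, 724000)
step = datetime.timedelta(milliseconds=20)
sample = [start, start + step, start, start + 2 * step]

flags_pd = flag_backward_steps_pd(sample)
print(flags_pd[flags_pd].index.tolist())  # [2]
```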
When converted to a Series, realtime is automatically cast to np.datetime64, and diff produces Timedelta objects.
Timings:
realtime = np.array([datetime.datetime(2017, 11, 3, 20, 25, 10, 724000),
datetime.datetime(2017, 11, 3, 20, 25, 10, 744000),
datetime.datetime(2017, 11, 3, 20, 25, 10, 764000),
datetime.datetime(2017, 11, 4, 2, 13, 44, 704000),
datetime.datetime(2017, 11, 4, 2, 13, 44, 724000),
datetime.datetime(2017, 11, 4, 2, 13, 44, 744000)], dtype=object)
realtime = np.random.choice(realtime, size=1045702)
In [256]: %timeit [x.total_seconds() > 0 for x in np.diff(realtime)]
1 loop, best of 3: 382 ms per loop
In [257]: %timeit np.diff(realtime) > datetime.timedelta(0)
10 loops, best of 3: 88.2 ms per loop
In [258]: %timeit (pd.Series(realtime).diff() > pd.Timedelta(0))
10 loops, best of 3: 147 ms per loop
In [259]: %%timeit
...: b = pd.Series(realtime).diff()
...: b.iat[0] = 1
...:
...: b > pd.Timedelta(0)
...:
10 loops, best of 3: 149 ms per loop
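Note that the loop in the question flags a sample when it is smaller than or equal to any earlier sample, which is a stricter test than comparing each sample with its immediate predecessor. If that behaviour is what is wanted, one vectorized option is to compare each sample with the running maximum of everything before it. This is a sketch, not part of the original answer: the function name is made up, and it assumes naive datetime.datetime elements, which are converted to integer microseconds so that np.maximum.accumulate operates on plain int64 values.

```python
import datetime
import numpy as np

def flag_not_after_all_prior(times):
    """Return a boolean array that is True where a sample is <= some earlier sample."""
    # Assumption: all elements are naive datetime.datetime objects.
    ticks = np.asarray(times, dtype="datetime64[us]").astype("int64")
    flags = np.zeros(ticks.shape, dtype=bool)
    if ticks.size > 1:
        # running_max[i] is the latest timestamp seen in ticks[0..i].
        running_max = np.maximum.accumulate(ticks)
        flags[1:] = ticks[1:] <= running_max[:-1]
    return flags

# Hypothetical data, as offsets in ms from the start: 0, 20, 10, 15, 40.
start = datetime.datetime(2017, 11, 3, 20, 25, 10, 724000)
ms = datetime.timedelta(milliseconds=1)
sample = np.array([start, start + 20 * ms, start + 10 * ms,
                   start + 15 * ms, start + 40 * ms], dtype=object)

flags_all = flag_not_after_all_prior(sample)
print(np.flatnonzero(flags_all))  # [2 3]
```

In this example an adjacent comparison would flag only index 2, because the sample at index 3 is later than its predecessor even though it is still earlier than the sample at index 1.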