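I am processing logs. I need to compute the total time a process was running, excluding long interruptions. The maximum allowed interruption is 30 seconds. A log entry is emitted every 3 seconds.
For example, if the process ran from 10:20:00 to 10:30:00 and was interrupted from 10:24:10 to 10:27:10, the expected result is the sum (10:24:10 - 10:20:00) + (10:30:00 - 10:27:10) = 420 seconds. However, computing the differences with the datetime type does not give me a valid result. My guess is that it computes the difference without including the start/end seconds.
Here is the solution I came up with (['timestamps'] is a list of datetime timestamps, normally emitted every 3 seconds):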
for k, v in proc_activity.items():
    proc_activity[k]['duration'] = 0
    start, next = v['timestamps'][0], ''
    for time in v['timestamps']:
        next = time
        diff = next - start
        if diff.seconds < 30:
            proc_activity[k]['duration'] += diff.seconds
        else:
            print("diff: %s" % diff.seconds)
        start = next
    print(f"added: {proc_activity[k]['duration']}")
    diff = v['timestamps'][-1] - v['timestamps'][0]
    print(f"real: {diff.seconds}")
Output:
added: 39
real: 45
added: 39
real: 45
diff: 36
added: 155
real: 218
Any suggestions on how to fix this?
Update, sample input data:
{'service_0': {'timestamps': [datetime.datetime(2018, 7, 1, 22, 33, 39, 86170),
datetime.datetime(2018, 7, 1, 22, 33, 42, 33213),
datetime.datetime(2018, 7, 1, 22, 33, 44, 898234),
datetime.datetime(2018, 7, 1, 22, 33, 47, 893731),
datetime.datetime(2018, 7, 1, 22, 33, 50, 928946),
datetime.datetime(2018, 7, 1, 22, 33, 53, 895617),
datetime.datetime(2018, 7, 1, 22, 35, 7, 116182),
datetime.datetime(2018, 7, 1, 22, 35, 10, 105035),
datetime.datetime(2018, 7, 1, 22, 35, 13, 193428),
datetime.datetime(2018, 7, 1, 22, 35, 16, 210135),
datetime.datetime(2018, 7, 1, 22, 35, 19, 168881),
datetime.datetime(2018, 7, 1, 22, 35, 22, 114653),
datetime.datetime(2018, 7, 1, 22, 35, 25, 102365),
datetime.datetime(2018, 7, 1, 22, 35, 43, 46950),
datetime.datetime(2018, 7, 1, 22, 35, 46, 15435),
datetime.datetime(2018, 7, 1, 22, 35, 49, 23333),
datetime.datetime(2018, 7, 1, 22, 35, 52, 22164),
datetime.datetime(2018, 7, 1, 22, 35, 55, 78615),
datetime.datetime(2018, 7, 1, 22, 35, 58, 78573)]}}
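Answer 0 (score: 3)
In short, I think the key thing you are missing is to use timedelta.total_seconds() instead of timedelta.seconds. The seconds attribute holds only the whole-seconds component of the delta, so the microseconds of every interval are dropped.
This seems to work fine for me: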
import datetime
from pprint import pprint
def get_duration(timestamps):
    max_interruption = 30
    starts = timestamps[:-1]
    ends = timestamps[1:]
    durations = zip(starts, ends)
    accumulated = 0
    for start, end in durations:
        delta = (end - start).total_seconds()
        if delta < max_interruption:
            accumulated += delta
    return accumulated
proc_activity = {
'service_0': {
'timestamps': [
datetime.datetime(2018, 7, 1, 22, 33, 39, 86170),
datetime.datetime(2018, 7, 1, 22, 33, 42, 33213),
datetime.datetime(2018, 7, 1, 22, 33, 44, 898234),
datetime.datetime(2018, 7, 1, 22, 33, 47, 893731),
datetime.datetime(2018, 7, 1, 22, 33, 50, 928946),
datetime.datetime(2018, 7, 1, 22, 33, 53, 895617),
datetime.datetime(2018, 7, 1, 22, 35, 7, 116182),
datetime.datetime(2018, 7, 1, 22, 35, 10, 105035),
datetime.datetime(2018, 7, 1, 22, 35, 13, 193428),
datetime.datetime(2018, 7, 1, 22, 35, 16, 210135),
datetime.datetime(2018, 7, 1, 22, 35, 19, 168881),
datetime.datetime(2018, 7, 1, 22, 35, 22, 114653),
datetime.datetime(2018, 7, 1, 22, 35, 25, 102365),
datetime.datetime(2018, 7, 1, 22, 35, 43, 46950),
datetime.datetime(2018, 7, 1, 22, 35, 46, 15435),
datetime.datetime(2018, 7, 1, 22, 35, 49, 23333),
datetime.datetime(2018, 7, 1, 22, 35, 52, 22164),
datetime.datetime(2018, 7, 1, 22, 35, 55, 78615),
datetime.datetime(2018, 7, 1, 22, 35, 58, 78573)
],
}
}
for k, v in proc_activity.items():
    proc_activity[k]['duration'] = get_duration(v['timestamps'])
pprint(proc_activity)
The resulting duration is 65.77183800000002 seconds.
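To make the difference between the two attributes concrete, here is a minimal sketch using two consecutive timestamps from the sample data. The variable names (a, b, delta, long_gap) are only for illustration and are not part of the original code.

```python
import datetime

# Two consecutive log timestamps taken from the sample data.
a = datetime.datetime(2018, 7, 1, 22, 33, 42, 33213)
b = datetime.datetime(2018, 7, 1, 22, 33, 44, 898234)
delta = b - a

# .seconds is only the whole-seconds component; microseconds are kept separately.
print(delta.seconds)          # 2
print(delta.microseconds)     # 865021
# .total_seconds() returns the full length of the interval as a float.
print(delta.total_seconds())  # 2.865021

# .seconds also ignores the days component, so it wraps around for gaps over a day.
long_gap = datetime.timedelta(days=1, seconds=5)
print(long_gap.seconds)          # 5
print(long_gap.total_seconds())  # 86405.0
```

Summing .seconds over many 3-second intervals loses up to a second each time, which would explain why the "added" totals in the question are lower than the "real" ones even when no gap was skipped.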
Answer 1 (score: 3)
My attempt at the problem using generators:
import datetime
def calculate(timestamps, largest_interrupt=30):
    begin_t, last_good_t = timestamps[0], timestamps[0]
    for current_t, previous_t in zip(timestamps[1:], timestamps):
        if (current_t - last_good_t).total_seconds() < largest_interrupt:
            last_good_t = current_t
            continue
        yield (previous_t - begin_t).total_seconds()
        last_good_t, begin_t = current_t, current_t
    yield (current_t - begin_t).total_seconds()
sample_data = {'service_0': {'timestamps': [datetime.datetime(2018, 7, 1, 22, 33, 39, 86170),
datetime.datetime(2018, 7, 1, 22, 33, 42, 33213),
datetime.datetime(2018, 7, 1, 22, 33, 44, 898234),
datetime.datetime(2018, 7, 1, 22, 33, 47, 893731),
datetime.datetime(2018, 7, 1, 22, 33, 50, 928946),
datetime.datetime(2018, 7, 1, 22, 33, 53, 895617),
datetime.datetime(2018, 7, 1, 22, 35, 7, 116182),
datetime.datetime(2018, 7, 1, 22, 35, 10, 105035),
datetime.datetime(2018, 7, 1, 22, 35, 13, 193428),
datetime.datetime(2018, 7, 1, 22, 35, 16, 210135),
datetime.datetime(2018, 7, 1, 22, 35, 19, 168881),
datetime.datetime(2018, 7, 1, 22, 35, 22, 114653),
datetime.datetime(2018, 7, 1, 22, 35, 25, 102365),
datetime.datetime(2018, 7, 1, 22, 35, 43, 46950),
datetime.datetime(2018, 7, 1, 22, 35, 46, 15435),
datetime.datetime(2018, 7, 1, 22, 35, 49, 23333),
datetime.datetime(2018, 7, 1, 22, 35, 52, 22164),
datetime.datetime(2018, 7, 1, 22, 35, 55, 78615),
datetime.datetime(2018, 7, 1, 22, 35, 58, 78573)]}}
for k, v in sample_data.items():
    s = sum(calculate(v['timestamps']))
    print(f"Service '{k}' has duration of '{s}' seconds")
The program prints:
Service 'service_0' has duration of '65.771838' seconds
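As a sanity check against the 10:20:00 to 10:30:00 example in the question, the gap-filtering idea used in the answers can be sketched as a single expression and run on synthetic timestamps. This is only an illustration under stated assumptions: the helper names active_seconds and stamps, the date, and the 10-second log interval (chosen so the stamps land exactly on the interruption boundaries) are not from the original posts.

```python
import datetime


def active_seconds(timestamps, max_gap=30.0):
    """Sum the gaps between consecutive timestamps that are shorter than max_gap."""
    gaps = ((b - a).total_seconds() for a, b in zip(timestamps, timestamps[1:]))
    return sum(gap for gap in gaps if gap < max_gap)


def stamps(start, end, step_seconds=10):
    """Hypothetical helper: evenly spaced timestamps from start to end, inclusive."""
    step = datetime.timedelta(seconds=step_seconds)
    out = []
    while start <= end:
        out.append(start)
        start += step
    return out


day = datetime.datetime(2018, 7, 1)  # arbitrary date, assumed for the example
first_run = stamps(day.replace(hour=10, minute=20), day.replace(hour=10, minute=24, second=10))
second_run = stamps(day.replace(hour=10, minute=27, second=10), day.replace(hour=10, minute=30))

# The 180-second interruption between the two runs exceeds max_gap and is skipped.
print(active_seconds(first_run + second_run))  # 420.0
```

Because zip over a list with fewer than two entries yields nothing, this variant returns 0 for an empty or single-element list instead of raising.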