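I have a log file that contains different MAC addresses on different lines.
I can extract the lines that contain a given MAC address, trim each line down to just the timestamp (e.g. 15:48:55), and add that timestamp to an array.
What I want:
How can I compare all adjacent elements of the array to check whether the difference between two timestamps is greater than 2 seconds?
Sample log file: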
info: 02-03-2018, 15:48:55.730, 192.168.1.4, 33826, 5C-CF-7F-29-AB-73, 22496
info: 02-03-2018, 15:48:55.894, 192.168.1.6, 17948, A0-20-A6-0A-F2-AB, 22475
info: 02-03-2018, 15:48:56.031, 192.168.1.3, 32538, A0-20-A6-0A-2F-D5, 22510
info: 02-03-2018, 15:48:56.742, 192.168.1.7, 40596, 60-01-94-16-05-96, 22490
info: 02-03-2018, 15:48:57.475, 192.168.1.5, 30646, 5C-CF-7F-DA-5B-77, 22668
info: 02-03-2018, 15:48:57.780, 192.168.1.4, 39592, 5C-CF-7F-29-AB-73, 22497
info: 02-03-2018, 15:48:57.922, 192.168.1.6, 21467, A0-20-A6-0A-F2-AB, 22476
info: 02-03-2018, 15:48:58.055, 192.168.1.3, 13001, A0-20-A6-0A-2F-D5, 22511
info: 02-03-2018, 15:48:58.760, 192.168.1.7, 31030, 60-01-94-16-05-96, 22491
info: 02-03-2018, 15:48:59.487, 192.168.1.5, 46505, 5C-CF-7F-DA-5B-77, 22669
What I have so far:
from datetime import datetime
import re

# Regex used to match relevant log lines (in this case, a specific MAC address)
line_regex = re.compile(r'A0-20-A6-0A-F2-AB')

# Timestamps collected from the matching lines
array1 = []

with open("info.log", "r") as in_file:
    # Loop over each log line
    for line in in_file:
        # If the log line matches our regex
        if line_regex.search(line):
            # Extract the timestamp, e.g. 15:48:55
            asd = line[18:26]
            # Convert to a datetime object
            datetime_object = datetime.strptime(asd, '%H:%M:%S')
            # Drop the default date part of the datetime object
            dsa = datetime_object.strftime('%H:%M:%S')
            # Add the timestamp to the array
            array1.append(dsa)

print(array1)
Answer 0: (score: 2)
You don't really need any extra modules; you just need to work with datetime objects instead of strings.
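from datetime import datetime
import re

line_regex = re.compile(r'A0-20-A6-0A-F2-AB')
prev_dsa = None

with open("info.log", "r") as in_file:
    for line in in_file:
        if line_regex.search(line):
            # Slice out the date and time, e.g. "02-03-2018, 15:48:55"
            asd = line[6:26]
            dsa = datetime.strptime(asd, '%m-%d-%Y, %H:%M:%S')
            # Compare with the previous matching line, if there is one
            if prev_dsa is not None:
                # total_seconds() covers the whole difference; .seconds ignores the days part
                if abs((dsa - prev_dsa).total_seconds()) >= 2:
                    print(dsa)
            prev_dsa = dsa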
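The code above truncates each timestamp to whole seconds. If sub-second precision matters, one possible variant parses the milliseconds as well and compares each adjacent pair of timestamps with `zip`. This is only a sketch: the function name `find_gaps`, its parameters, and the assumption that every line follows the `info: date, time, ip, port, mac, counter` layout of the sample are illustrative and not part of the original answer.

```python
from datetime import datetime, timedelta


def find_gaps(lines, mac, threshold=timedelta(seconds=2)):
    """Return (previous, current) timestamp pairs for `mac` more than `threshold` apart."""
    stamps = []
    for line in lines:
        if mac not in line:
            continue
        # Fields are separated by ", "; the first one looks like "info: <date>"
        fields = line.strip().split(", ")
        date_part = fields[0].split(": ", 1)[1]
        # Assumes month-day-year order, as in the answer above
        stamps.append(
            datetime.strptime(date_part + " " + fields[1], "%m-%d-%Y %H:%M:%S.%f")
        )
    # Pair each timestamp with its successor and keep the pairs that are too far apart
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > threshold]


sample = [
    "info: 02-03-2018, 15:48:55.894, 192.168.1.6, 17948, A0-20-A6-0A-F2-AB, 22475",
    "info: 02-03-2018, 15:48:56.031, 192.168.1.3, 32538, A0-20-A6-0A-2F-D5, 22510",
    "info: 02-03-2018, 15:48:57.922, 192.168.1.6, 21467, A0-20-A6-0A-F2-AB, 22476",
]
for previous, current in find_gaps(sample, "A0-20-A6-0A-F2-AB"):
    print(previous.time(), "->", current.time(), (current - previous).total_seconds())
```

Passing an open file object instead of `sample` should work the same way, since the function only iterates over lines.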
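Answer 1: (score: 0)
You can use pandas. Note: this can easily be adapted to take the date into account, but as written it only considers the time component.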
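import pandas as pd

# A multi-character delimiter needs the Python parsing engine
df = pd.read_csv('file.csv', header=None, delimiter=', ', engine='python')
# total_seconds() keeps the whole difference; .dt.microseconds holds only the sub-second part
df['Diff'] = pd.to_datetime(df[1], format='%H:%M:%S.%f').diff().dt.total_seconds()
# differences greater than half a second
res = df[df['Diff'] > 0.5]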
#                   0             1            2      3                  4  \
# 3  info: 02-03-2018  15:48:56.742  192.168.1.7  40596  60-01-94-16-05-96
# 4  info: 02-03-2018  15:48:57.475  192.168.1.5  30646  5C-CF-7F-DA-5B-77
# 8  info: 02-03-2018  15:48:58.760  192.168.1.7  31030  60-01-94-16-05-96
# 9  info: 02-03-2018  15:48:59.487  192.168.1.5  46505  5C-CF-7F-DA-5B-77
#
#        5   Diff
# 3  22490  0.711
# 4  22668  0.733
# 8  22491  0.705
# 9  22669  0.727
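The snippet above compares every line with the one before it, regardless of device, and looks only at the time of day. A possible adaptation that includes the date and measures the gap per MAC address, which is closer to what the question asks, might look like the following. It is a sketch under assumptions: the in-memory `log` stands in for the real file, the column names `When` and `Diff` are arbitrary, and the date is assumed to be month-day-year.

```python
import io

import pandas as pd

# Two sample lines per device, standing in for the log file
log = io.StringIO(
    "info: 02-03-2018, 15:48:55.730, 192.168.1.4, 33826, 5C-CF-7F-29-AB-73, 22496\n"
    "info: 02-03-2018, 15:48:55.894, 192.168.1.6, 17948, A0-20-A6-0A-F2-AB, 22475\n"
    "info: 02-03-2018, 15:48:57.780, 192.168.1.4, 39592, 5C-CF-7F-29-AB-73, 22497\n"
    "info: 02-03-2018, 15:48:57.922, 192.168.1.6, 21467, A0-20-A6-0A-F2-AB, 22476\n"
)
df = pd.read_csv(log, header=None, sep=", ", engine="python")

# Column 0 looks like "info: 02-03-2018"; drop the prefix and append the time column
stamp = df[0].str.replace("info: ", "", regex=False) + " " + df[1]
df["When"] = pd.to_datetime(stamp, format="%m-%d-%Y %H:%M:%S.%f")

# Difference to the previous line of the same MAC address (column 4), in seconds
df["Diff"] = df.groupby(4)["When"].diff().dt.total_seconds()

# Keep the lines that arrived more than 2 seconds after the device's previous line
res = df[df["Diff"] > 2]
print(res[[4, "When", "Diff"]])
```

The first line of each device has no predecessor, so its `Diff` is `NaN` and it never passes the filter.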