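I am working on extracting some data from text files. I have been using MATLAB for a while, but it was becoming a strain for this kind of task, so I have started using Python for the extraction. Now I have a fairly complex problem and I don't even know how to approach it.
Here is what I have done so far. I have a log file that looks like this: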
2017-12-21T23:59:19.120Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:19.120Z 'D|Beat:2143|B->'
2017-12-21T23:59:19.120Z 'D|Beat:2113|sndB:0x5caa'
2017-12-21T23:59:19.175Z 'I|PSnd: 61|snd[3D]:FFFF m:0x5caa e:0'
2017-12-21T23:59:19.175Z 'I|PSnd: 233|sD[3D]m:0x5caa e:0'
2017-12-21T23:59:19.175Z 'D|Beat:1259|WDTimeout: 300'
2017-12-21T23:59:19.175Z 'D|Beat:1282|sd:0x5caa: e:0'
2017-12-21T23:59:19.175Z 'D|Beat:1302|sprts'
2017-12-21T23:59:19.175Z 'D|LgPl: 68|BSP:getSize:19'
2017-12-21T23:59:19.175Z 'D|Beat:5503|GetPckt:0x4e5e'
2017-12-21T23:59:19.175Z 'D|Beat:7140|Prtns->'
2017-12-21T23:59:19.175Z 'D|Beat:2008|sevt:72'
2017-12-21T23:59:19.175Z 'I|Beat:2021|SndQ:1'
2017-12-21T23:59:19.175Z 'D|Beat:1805|snd:0x4e5e'
2017-12-21T23:59:19.175Z 'I|PSnd: 61|snd[B0]:FFFF m:0x4e5e e:0'
2017-12-21T23:59:19.175Z 'I|PSnd: 233|sD[B0]m:0x4e5e e:0'
2017-12-21T23:59:19.175Z 'D|Beat:1866|sd:0x4e5e:0'
2017-12-21T23:59:19.175Z 'D|Beat:1192|drop:2402 q:43'
2017-12-21T23:59:19.301Z 'D|Beat:1220|Rcv<-RP, s:2402'
2017-12-21T23:59:19.301Z 'D|LgPl: 68|BSP:getSize:19'
2017-12-21T23:59:19.301Z 'I|Beat:1243|RcvQ:1'
2017-12-21T23:59:19.301Z 'D|Beat:1245|FrMsg:0x4cc0 QMsg:0x3ba4'
2017-12-21T23:59:19.301Z 'D|Beat:8934|AAltB->B1302'
2017-12-21T23:59:19.416Z 'D|Beat:1192|drop:2402 q:50'
2017-12-21T23:59:19.416Z 'D|Beat:10392|RStp'
2017-12-21T23:59:19.437Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:19.489Z 'D|Beat:6502|slt:2'
2017-12-21T23:59:19.489Z 'D|Beat:10341|RStrt'
2017-12-21T23:59:19.489Z 'D|Beat:4713|prtTS:2'
2017-12-21T23:59:19.489Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:19.552Z 'D|Beat:1192|drop:2402 q:36'
2017-12-21T23:59:19.820Z 'D|Beat:1192|drop:2402 q:48'
2017-12-21T23:59:19.820Z 'D|Beat:10747|PLife:67'
2017-12-21T23:59:19.820Z 'D|Beat:4906|nojump'
2017-12-21T23:59:19.820Z 'D|Beat:10392|RStp'
2017-12-21T23:59:19.820Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:19.873Z 'D|Beat:6502|slt:3'
2017-12-21T23:59:20.266Z 'D|Beat:6502|slt:4'
2017-12-21T23:59:20.266Z 'D|Beat:10341|RStrt'
2017-12-21T23:59:20.266Z 'D|Beat:4713|prtTS:4'
2017-12-21T23:59:20.266Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:20.318Z 'D|Beat:1192|drop:2301 q:49'
2017-12-21T23:59:20.339Z 'D|Beat:1358|drop:2301 q:49'
2017-12-21T23:59:20.339Z 'D|Beat:1220|Rcv<-RP, s:2402'
2017-12-21T23:59:20.339Z 'D|LgPl: 68|BSP:getSize:19'
2017-12-21T23:59:20.339Z 'I|Beat:1243|RcvQ:1'
2017-12-21T23:59:20.339Z 'D|Beat:1245|FrMsg:0x4192 QMsg:0x4cc0'
2017-12-21T23:59:20.339Z 'D|Beat:1192|drop:2402 q:48'
2017-12-21T23:59:20.454Z 'D|Beat:1192|drop:2402 q:51'
2017-12-21T23:59:20.579Z 'D|Beat:1192|drop:2402 q:48'
2017-12-21T23:59:20.610Z 'D|Beat:10747|PLife:68'
2017-12-21T23:59:20.610Z 'D|Beat:4906|nojump'
2017-12-21T23:59:20.610Z 'D|Beat:10392|RStp'
2017-12-21T23:59:20.610Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:20.632Z 'D|Beat:6502|slt:5'
2017-12-21T23:59:21.045Z 'D|Beat:6502|slt:6'
2017-12-21T23:59:21.045Z 'D|Beat:10341|RStrt'
2017-12-21T23:59:21.045Z 'D|Beat:4713|prtTS:6'
2017-12-21T23:59:21.045Z 'D|Beat: 971|RStrtD'
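Now I need to extract every line containing RStrtD that is followed by a line containing RStpD, find the time difference between the two for each such occurrence in the text file, and then add those times together.
I used the code below for the extraction: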
print("trying out something spectacular")

def get_line(content, find_word1, find_word2):
    # Keep only the lines that contain either search word.
    lines = []
    for line in content.strip().split('\n'):
        if find_word1 in line or find_word2 in line:
            lines.append(line)
    return lines

def get_all_lines(f_name, find_word1, find_word2):
    # Read the whole log file and filter its lines.
    with open(f_name, 'r') as f:
        f_content = f.read()
    return get_line(f_content, find_word1, find_word2)

def get_files_in(in_file, find_word1, find_word2, out_file):
    # Write the filtered lines to out_file, one per line.
    filtered_lines = get_all_lines(in_file, find_word1, find_word2)
    joined_lines = '\n'.join(filtered_lines)
    with open(out_file, 'w') as f:
        f.write(joined_lines)

#fix= "mm", "cts"
get_files_in("./sss1.txt", "RStrtD", "RStpD", "./result1.txt")
After running it, I got the following output:
2017-12-21T23:59:43.561Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:44.419Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:44.715Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:46.730Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:47.062Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:48.273Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:48.625Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:49.487Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:49.783Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:51.789Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:52.122Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:53.334Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:53.680Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:54.529Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:54.835Z 'D|Beat: 997|RStpD'
2017-12-21T23:59:56.840Z 'D|Beat: 971|RStrtD'
2017-12-21T23:59:57.182Z 'D|Beat: 997|RStpD'
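This is good, but I now need to subtract the times on consecutive lines from each other and then take the sum of all those differences. I really don't know how to do this. I am not yet familiar with time vectors in Python.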
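For reference, here is a minimal sketch of the calculation I have in mind, written with the standard-library datetime module. The names parse_timestamp, total_interval and TIME_FORMAT are hypothetical, and the sketch assumes that every line starts with a timestamp in the format shown above, that each RStrtD should be paired with the next RStpD, and that an RStpD with no preceding RStrtD (like the first line of my output) should be ignored.

```python
from datetime import datetime, timedelta

# Assumed timestamp layout, e.g. 2017-12-21T23:59:44.419Z
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_timestamp(line):
    """Parse the timestamp that forms the first whitespace-separated field."""
    return datetime.strptime(line.split()[0], TIME_FORMAT)


def total_interval(lines, start_word="RStrtD", stop_word="RStpD"):
    """Sum the time between each start line and the next stop line."""
    total = timedelta(0)
    start = None  # timestamp of the most recent unmatched start line
    for line in lines:
        if start_word in line:
            # If two starts occur in a row, the later one wins (an assumption).
            start = parse_timestamp(line)
        elif stop_word in line and start is not None:
            total += parse_timestamp(line) - start
            start = None  # wait for the next start line
    return total


# A few lines taken from the filtered output above.
sample = [
    "2017-12-21T23:59:43.561Z 'D|Beat: 997|RStpD'",
    "2017-12-21T23:59:44.419Z 'D|Beat: 971|RStrtD'",
    "2017-12-21T23:59:44.715Z 'D|Beat: 997|RStpD'",
    "2017-12-21T23:59:46.730Z 'D|Beat: 971|RStrtD'",
    "2017-12-21T23:59:47.062Z 'D|Beat: 997|RStpD'",
]

print(total_interval(sample).total_seconds())  # 0.296 + 0.332 -> 0.628
```

Subtracting two datetime objects gives a timedelta, and timedelta values can be added together, so no separate time vector is needed. The list returned by get_all_lines could be passed in place of sample.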