I have two log files: log A and log B.
log A
2015-07-12 08:50:33,904 [Collection-3]INFO app -Executing Scheduled job: System: choppa1
2015-07-12 09:56:45,060 [Collection-3] INFO app - Executing Scheduled job: System: choppa1
2015-07-12 10:00:00,001 [Analytics_Worker-1] INFO app - Trigger for job AnBuildAuthorizationJob was fired.
2015-07-12 11:00:00,007 [Analytics_Worker-1] INFO app - Starting the AnBuildAuthorizationJob job.
log B
2014-07-12 09:50:33,904 [Collection-3] INFO app - Executing Scheduled job: System: choppa1
2014-07-12 09:56:45,060 [Collection-3] INFO app - Executing Scheduled job: System: choppa1
2014-07-12 10:00:00,001 [Analytics_Worker-1] INFO app - Trigger for job AnBuildAuthorizationJob was fired.
2014-07-12 10:00:00,007 [Analytics_Worker-1] INFO app - Starting the AnBuildAuthorizationJob job.
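The two log files have the same content but different timestamps. I need to compare the two files while ignoring the timestamps: every line of one file should be compared with the corresponding line of the other, and no difference should be reported when only the timestamps differ. I wrote the following Python script for this: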
#!/usr/bin/python
import re
import difflib
program = open("log1.txt", "r")
program_contents = program.readlines()
program.close()
new_contents = []
pat = re.compile("^[^0-9]")
for line in program_contents:
    if re.search(pat, line):
        new_contents.append(line)
program = open("log2.txt", "r")
program_contents1 = program.readlines()
program.close()
new_contents1 = []
pat = re.compile("^[^0-9]")
for line in program_contents1:
    if re.search(pat, line):
        new_contents1.append(line)
diff = difflib.ndiff(new_contents, new_contents1)
print(''.join(diff))
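Is there a more efficient way to write the script above? Also, the script only works when the timestamp is at the beginning of the line. I want to write a Python script that also works when the timestamp appears somewhere in the middle of a line. Can anyone help me with how to do this?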
Answer 0 (score: 0)
I would change pat = re.compile("^[^0-9]")
to pat = re.compile("\d{4}-\d{2}-\d{2}
It is also better to open the file with
with open(filename) as f:
That way Python closes the file for you, and no explicit f.close() statement is needed.
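Both suggestions in this answer can be combined into one short script. The following is only a minimal sketch, not the answerer's own code: it assumes every timestamp has the shape seen in the sample logs (`2015-07-12 08:50:33,904`), and the names `TIMESTAMP`, `strip_timestamps`, `diff_ignoring_timestamps`, and `read_lines` are made up for illustration. Because the pattern is not anchored with `^`, it also removes timestamps that appear in the middle of a line.

```python
import difflib
import re

# Assumed timestamp shape, taken from the sample logs: "2015-07-12 08:50:33,904".
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def strip_timestamps(line):
    """Remove every timestamp from the line, wherever it appears."""
    return TIMESTAMP.sub("", line)


def diff_ignoring_timestamps(lines_a, lines_b):
    """Return the unified-diff lines of two logs after removing timestamps."""
    a = [strip_timestamps(line) for line in lines_a]
    b = [strip_timestamps(line) for line in lines_b]
    return list(difflib.unified_diff(a, b, fromfile="log1.txt", tofile="log2.txt"))


def read_lines(path):
    """Read a file into a list of lines; the with-block closes the file."""
    with open(path) as f:
        return f.readlines()
```

To compare the two files, one could call `diff_ignoring_timestamps(read_lines("log1.txt"), read_lines("log2.txt"))` and print the joined result. An empty list means the logs differ only in their timestamps.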
Answer 1 (score: 0)
Here is a small script that strips the timestamp from the beginning of each line.
program = open("log1.txt", "r")
program_contents = program.readlines()
program.close()
program = open("log2.txt", "r")
program_contents1 = program.readlines()
program.close()
# Only compare line pairs that exist in both files.
for i in range(0, min(len(program_contents), len(program_contents1))):
    if program_contents[i] == '\n':
        continue
    # Skip the 23-character timestamp, e.g. "2015-07-12 08:50:33,904".
    if program_contents[i][23:] == program_contents1[i][23:]:
        print("Matches")
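The positional comparison in this answer can also be wrapped in a function that reports which lines differ instead of printing "Matches" for each equal pair. This is a hedged sketch under two assumptions: the timestamp always occupies the first 23 characters of a line (the length of `2015-07-12 08:50:33,904`), and a line that exists in only one file counts as a difference. The names `TIMESTAMP_LEN` and `mismatched_lines` are hypothetical.

```python
from itertools import zip_longest

# Assumed prefix length: "2015-07-12 08:50:33,904" is 23 characters.
TIMESTAMP_LEN = 23


def mismatched_lines(lines_a, lines_b, prefix_len=TIMESTAMP_LEN):
    """Return 1-based line numbers whose text after the prefix differs."""
    mismatches = []
    for number, (a, b) in enumerate(zip_longest(lines_a, lines_b), start=1):
        # zip_longest pads the shorter input with None; treat that as a mismatch.
        if a is None or b is None or a[prefix_len:] != b[prefix_len:]:
            mismatches.append(number)
    return mismatches
```

Slicing by a fixed length is fast but only works while the timestamp stays at the start of the line and keeps the same width. A timestamp in the middle of a line needs a pattern-based approach instead.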