Question

I'm writing a quick log-parsing tool:

findme = 'important '
logf = file('new.txt')
newlines = []

for line in logf:
    if findme in line:
        line.partition("as follows: ")[2]
        newlines.append(line)


outfile = file('out.txt', 'w')
outfile.writelines(newlines)

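I can't figure out how to use partition (or something like it) to remove "as follows: " and everything before it on each line. I don't get any errors, but the text I'm trying to remove is still in the output.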

Answer 1

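I'm a bit confused by this line:

line.partition("as follows: ")[2]

It does nothing. Strings are immutable, so partition returns a new tuple and leaves line unchanged, and the result is simply discarded. Maybe you wanted this instead:

line = line.partition("as follows: ")[2]

By the way, it is better to write each line inside the for loop than to call writelines once at the end on a giant list. Your current solution uses a lot of memory for large files and cannot work at all on input that never ends.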
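To see why the bare partition call has no effect, here is a minimal sketch. The log line is made up for illustration and is not from the asker's file.

```python
# Hypothetical log line, invented for this example.
line = "12:00 important event as follows: disk full\n"

# partition returns a 3-tuple: (text before, separator, text after).
before, sep, after = line.partition("as follows: ")

# Calling partition without assigning the result leaves `line` untouched.
line.partition("as follows: ")[2]
unchanged = line

# Rebinding the name is what actually keeps the stripped text.
line = line.partition("as follows: ")[2]

# If the separator is absent, the last two tuple items are empty strings.
missing = "no marker here\n".partition("as follows: ")
```

Note the last case: on a line without the marker, index [2] is an empty string, so nothing would be written for that line.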

The final version looks like this:

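findme = 'important '
# The with block closes both files, so buffered output is flushed.
with open('new.txt') as logf, open('out.txt', 'w') as outfile:
    for line in logf:
        if findme in line:
            # Keep only the text after the marker.
            outfile.write(line.partition('as follows: ')[2])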
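The same line-at-a-time idea can be sketched as a generator, which makes it easy to try without real files. The function name strip_prefix, its marker parameter, and the sample log text are assumptions made for this illustration.

```python
import io

def strip_prefix(lines, findme, marker="as follows: "):
    """Yield the text after `marker` for each line containing `findme`."""
    for line in lines:
        if findme in line:
            # partition(...)[2] is '' when the marker is absent.
            yield line.partition(marker)[2]

# Hypothetical log contents, invented for illustration.
log = io.StringIO(
    "important event as follows: disk full\n"
    "routine event as follows: nothing to see\n"
    "important event as follows: fan failure\n"
)

out = io.StringIO()
for text in strip_prefix(log, "important "):
    # Only one line is held in memory at a time.
    out.write(text)
result = out.getvalue()
```

Because the generator never builds a list, it should behave the same on a small file, a very large file, or a stream that keeps producing lines.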

Answer 2

Here is a regex-based approach:

import re

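findme = 'important '
# Group 2 is the text after 'as follows: '. The conditional (?(1)...) accepts
# the line if findme appeared before the marker, or else requires it after.
pat = re.compile('.*(%s)?.*as follows: ((?(1).*\n|.*%s.*\n))' % (findme, findme))

with open('new.txt', 'r') as logf, open('out.txt', 'w') as outfile:
    for line in logf:
        m = pat.match(line)
        if m:
            outfile.write(m.group(2))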

The advantage is that it can search for more specific things than a plain 'if findme in line' test can. For example, with findme = '(?<!A)AAA(?!A)' it matches only a strict 'AAA', not 'AAAA'.
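A small sketch of that difference, using the same pattern as the answer. The two sample lines are made up for illustration.

```python
import re

findme = r'(?<!A)AAA(?!A)'
pat = re.compile('.*(%s)?.*as follows: ((?(1).*\n|.*%s.*\n))' % (findme, findme))

# Hypothetical log lines, invented for this example.
exact = "AAA alert as follows: disk full\n"
longer = "AAAA alert as follows: disk full\n"

# A plain substring test cannot tell the two apart.
substring_hits = ("AAA" in exact, "AAA" in longer)

# The lookbehind and lookahead reject an 'AAA' that touches another 'A'.
m_exact = pat.match(exact)
m_longer = pat.match(longer)
```

Here m_exact is a match whose group(2) holds the text after the marker, while m_longer is None because no standalone 'AAA' occurs anywhere in that line.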

Python: processing a log file and stripping characters
