Question

I want to filter a log file so that only the lines matching a particular pattern are kept. I want to do this in Python.

Here is my first attempt:

#!/usr/bin/env python

from sys import argv 

script, filename = argv
with open(filename) as f:
    for line in f:
        try:
            e = line.index("some_term_I_want_to_match")
        except: 
            pass
        else:
            print(line)

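How can I improve this to:

save the results to a new file with a similar name (i.e. a different extension)
use regular expressions to make it more flexible/powerful.

(I am just learning Python. This question is as much about learning Python as it is about achieving this particular result.)

OK, here is what I have come up with so far... But how do you do the equivalent of the r prefix, as in the following line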

re.compile(r"\s*")

when the string is not a string literal, as in the following line?

re.compile(a_string_variable)

Other than that, I think this updated version does the job:

#!/usr/bin/env python

from sys import argv 
import re
import os
import argparse #requires Python 2.7 or above

parser = argparse.ArgumentParser(description='filters a text file on the search phrase')
parser.add_argument('-s','--search', help='search phrase or keyword to match',required=True)
parser.add_argument('-f','--filename', help='input file name',required=True)
parser.add_argument('-v','--verbose', help='display output to the screen too', required=False, action="store_true")
args = parser.parse_args()

keyword = args.search
original_file = args.filename
verbose = args.verbose

base_file, ext = os.path.splitext(original_file)
new_file = base_file + ".filtered" + ext

regex_c = re.compile(keyword)

with open(original_file) as fi:
    with open(new_file, 'w') as fo:
        for line in fi:
            result = regex_c.search(line)
            if(result):
                fo.write(line)
                if(verbose):
                    print(line)

Can this be easily improved?

Answer 1

Well, you know, you have already answered most of your questions yourself :)

For regular expression matching, use the re module (its documentation has plenty of explanatory examples).
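On the r prefix: it only changes how a string literal in the source code is parsed. A string that arrives in a variable (for example from argparse) already holds exactly the characters that were typed, so it can be passed to re.compile() as it is. A minimal sketch of this; the sample log lines and the filter_lines helper are made up for illustration:

```python
import re


def filter_lines(lines, pattern):
    """Return the lines in which the regular expression `pattern` matches anywhere."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


# Hypothetical sample data standing in for a real log file.
sample = [
    "Jan 01 sshd[42]: Failed password for root\n",
    "Jan 01 cron[7]: job started\n",
    "Jan 02 sshd[43]: Accepted password for alice\n",
]

# The r prefix is only a notation for literals: both spellings are the same string.
assert r"sshd\[\d+\]" == "sshd\\[\\d+\\]"

# A pattern held in a variable (it could just as well be args.search) needs no prefix.
pattern_from_variable = r"sshd\[\d+\]"
matched = filter_lines(sample, pattern_from_variable)

# To match user input literally rather than as a regex, escape it first.
literal_matched = filter_lines(sample, re.escape("sshd[42]"))
```

Here matched holds the two sshd lines, while literal_matched holds only the line containing the exact text sshd[42].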

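You are already using the open() function to open the file. Use the same function to open a file for writing; just supply the appropriate mode argument ("w" or "a", combined with "+" if needed; see help(open) in the Python interactive shell). That's it.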
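As a rough sketch of the mode argument in use, the following writes matching lines to a new file with "w" and then adds more with "a". The filter_file helper, the app.log name and its contents are assumptions made for the example, and a temporary directory is used so that it can be run anywhere (Python 3):

```python
import os
import re
import tempfile


def filter_file(src_path, dst_path, pattern, mode="w"):
    """Copy lines of src_path that match `pattern` into dst_path.

    mode="w" truncates dst_path first, mode="a" appends to it.
    Returns the number of lines written.
    """
    regex = re.compile(pattern)
    kept = 0
    with open(src_path) as src, open(dst_path, mode) as dst:
        for line in src:
            if regex.search(line):
                dst.write(line)
                kept += 1
    return kept


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical input file created just for this demonstration.
    log_path = os.path.join(tmp, "app.log")
    with open(log_path, "w") as f:
        f.write("ERROR disk full\nINFO all good\nERROR out of memory\n")

    # Same naming scheme as in the question: app.log -> app.filtered.log
    base, ext = os.path.splitext(log_path)
    out_path = base + ".filtered" + ext

    first = filter_file(log_path, out_path, r"^ERROR")             # overwrite
    second = filter_file(log_path, out_path, r"^INFO", mode="a")   # append

    with open(out_path) as f:
        result = f.read()
```

With "w" the output file is emptied on every run, which is usually what a filter script wants; "a" is only useful when results from several runs should accumulate in one file.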
