Text parsing and retrieval

Time: 2015-10-13 14:00:35

Tags: python python-2.7

I need to write a script that parses a text file. Every time I encounter ACTION TYPE: Insertion, I need to retrieve the TIME value that follows it.

ACTION TYPE: Insertion
ISSUES: No
USER: ADMINISTRATOR
TIME: 2015-10-09 10.50.12
ACTION TYPE: Edition
ISSUES: No
USER: ADMINISTRATOR
TIME: 2015-10-09 11.21.34
ACTION TYPE: Insertion
ISSUES: No
USER: ADMINISTRATOR
TIME: 2015-10-09 12.19.22

2 answers:

Answer 0 (score: 1)

For small files that fit comfortably in memory, you can read the whole file at once and use the following approach:

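import re

# re.S lets '.' cross newlines, so the lazy '.*?' can skip ahead to the next TIME line;
# re.M lets '$' match at the end of each line, so the captured group stops there.
with open('input.txt', 'r') as f_input:
    print(re.findall(r'ACTION TYPE: Insertion.*?TIME: (.*?)$', f_input.read(), re.M | re.S))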

For your sample, this prints:

['2015-10-09 10.50.12', '2015-10-09 12.19.22']
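The approach above reads the entire file into memory. If the file is too large for that, one possible alternative is to scan it line by line and remember whether the current record is an Insertion. The following is only a sketch: the function name `insertion_times` and the inline `sample` string are assumptions made for illustration, and it assumes every record starts with an `ACTION TYPE:` line, as in the sample from the question.

```python
def insertion_times(lines):
    """Yield the TIME value of each record whose ACTION TYPE is Insertion."""
    in_insertion = False
    for line in lines:
        line = line.strip()
        if line.startswith('ACTION TYPE:'):
            # A new record begins; note whether it is an Insertion.
            in_insertion = (line == 'ACTION TYPE: Insertion')
        elif in_insertion and line.startswith('TIME:'):
            yield line[len('TIME:'):].strip()
            in_insertion = False


# Sample data copied from the question (hypothetical stand-in for a file).
sample = """ACTION TYPE: Insertion
ISSUES: No
USER: ADMINISTRATOR
TIME: 2015-10-09 10.50.12
ACTION TYPE: Edition
ISSUES: No
USER: ADMINISTRATOR
TIME: 2015-10-09 11.21.34
ACTION TYPE: Insertion
ISSUES: No
USER: ADMINISTRATOR
TIME: 2015-10-09 12.19.22
"""

times = list(insertion_times(sample.splitlines()))
print(times)
```

With a real file, passing the open file object (for example `insertion_times(f_input)`) should work the same way, since iterating over a file yields one line at a time.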

Answer 1 (score: 0)

The same idea as Martin Evans's answer, but with a simpler pattern:

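import re

# '.' does not match a newline by default, so each '.*' stays on its own line
# and no regex flags are needed.
pat = re.compile(r'ACTION TYPE: Insertion\nISSUES: .*\nUSER: .*\nTIME: (.*)')
with open('yourfile.txt', 'r') as f:
    insertion_times = pat.findall(f.read())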
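To check this pattern against the sample from the question, here is a small sketch that runs it on an inline string instead of a file. As an extra step that is not part of the original answer, it also converts each value to a `datetime`, which may help if the times need to be sorted or compared. The `sample` string and the format string `'%Y-%m-%d %H.%M.%S'` are assumptions based on the sample data shown in the question.

```python
import re
from datetime import datetime

# Sample data copied from the question (hypothetical stand-in for a file).
sample = (
    "ACTION TYPE: Insertion\nISSUES: No\nUSER: ADMINISTRATOR\n"
    "TIME: 2015-10-09 10.50.12\n"
    "ACTION TYPE: Edition\nISSUES: No\nUSER: ADMINISTRATOR\n"
    "TIME: 2015-10-09 11.21.34\n"
    "ACTION TYPE: Insertion\nISSUES: No\nUSER: ADMINISTRATOR\n"
    "TIME: 2015-10-09 12.19.22\n"
)

pat = re.compile(r'ACTION TYPE: Insertion\nISSUES: .*\nUSER: .*\nTIME: (.*)')
insertion_times = pat.findall(sample)
print(insertion_times)

# Assumed format: date with dashes, time with dots, as in the sample.
parsed = [datetime.strptime(t, '%Y-%m-%d %H.%M.%S') for t in insertion_times]
print(parsed[0].isoformat())
```

Note that this pattern relies on the fields appearing in exactly the order ACTION TYPE, ISSUES, USER, TIME; a record with a different layout would simply not match.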