I have a log file in the following format.
Wed Feb 21 00:59:32 2018 XXXXXX.x1:00000: message
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message
'----action----tansfer'
'----failed----'
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message
<Error occurred at line 44>
<html>
.....
....
....
</html>
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message
I need to reshape the log into the following format so that I can apply downstream text-processing logic.
Wed Feb 21 00:59:32 2018 XXXXXX.x1:00000: message
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message '----action----tansfer' '----failed----'
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message <Error occurred at line 44> <html>.... ..... ....</html>
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message
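Is it possible to get the log messages into this format? I was thinking of something like this: if a newline is not followed by text matching a date regex, replace it with a space character. I couldn't quite construct the regex, though.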
TIA
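Answer 0 (score: 2)
The following code reads the log file and writes it back out to out.txt in the desired format. The re.sub call uses a negative lookahead to do the job.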
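import re

with open('log.txt', 'r') as f:
    a = f.read()

# Remove every newline that is not followed by "Wed" (negative lookahead),
# which joins continuation lines onto the preceding entry.
a = re.sub(r'\n(?!Wed)', '', a)

with open('out.txt', 'w') as f:
    f.write(a)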
Output:
Wed Feb 21 00:59:32 2018 XXXXXX.x1:00000: message
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message '----action----tansfer' '----failed----'
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message <Error occurred at line 44><html>.............</html>
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message
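The lookahead in the code above is tied to the literal `Wed`, so it only fits logs written on a Wednesday. Below is a minimal sketch of a more general variant, closer to what the question describes: it replaces the newline with a space unless a timestamp follows. The `TIMESTAMP` pattern and the `join_continuations` helper are hypothetical names, and the timestamp shape is an assumption based only on the sample lines.

```python
import re

# Assumed timestamp shape, taken from the sample: "Wed Feb 21 00:59:32 2018".
TIMESTAMP = r"[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4}"

# Match a newline that is followed neither by a timestamp nor by end of text.
CONTINUATION = re.compile(r"\n(?!" + TIMESTAMP + r"|\Z)")


def join_continuations(text: str) -> str:
    """Replace each newline that precedes a non-timestamp line with a space."""
    return CONTINUATION.sub(" ", text)


sample = (
    "Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message\n"
    "'----action----tansfer'\n"
    "'----failed----'\n"
    "Thu Feb 22 01:00:00 2018 XXXXXX.x1:00000: message\n"
)
print(join_continuations(sample), end="")
```

The `\Z` alternative keeps the final newline of the file in place. Without it, the last newline would also be replaced, because nothing follows it.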
Answer 1 (score: 0)
This regex string looks like what you need:
'.*\d{2}\:\d{2}\:\d{2}\ \d{4}.*'
It tries to match this:
00:59:33 2018 # Any number works as long as it's this format
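As a rough illustration of how that pattern might be put to work, the sketch below tests each line against it and appends lines that do not match to the previous record. The `merge_lines` helper and the `log` sample are hypothetical and are not part of the answer.

```python
import re

# Pattern quoted in the answer; re.match tries it from the start of each line.
HAS_TIMESTAMP = re.compile(r'.*\d{2}\:\d{2}\:\d{2}\ \d{4}.*')


def merge_lines(lines):
    """Attach lines without a time-and-year stamp to the preceding record."""
    records = []
    for raw in lines:
        line = raw.rstrip("\n")
        if HAS_TIMESTAMP.match(line) or not records:
            records.append(line)          # start a new record
        else:
            records[-1] += " " + line     # continuation of the last record
    return records


log = [
    "Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message\n",
    "<Error occurred at line 44>\n",
    "<html>\n",
    "</html>\n",
    "Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message\n",
]
for record in merge_lines(log):
    print(record)
```

Because the pattern begins with `.*`, it accepts the time-and-year text anywhere in the line, so a continuation line that happens to contain such text would be treated as a new record.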
Answer 2 (score: 0)
Just my non-regex approach:
with open("./t.txt") as read_file:  # current log file
    with open("./fix_t.txt", "w") as write_file:  # new log file
        data = ""  # entry being assembled
        for line in read_file:
            if "message" in line:
                # A new entry starts: flush the previous one first.
                if data:
                    write_file.write(data + "\n")
                data = line.strip("\n")
            else:
                # Continuation line: append it to the current entry.
                data += line.strip("\n")
        # Flush the last entry.
        if data:
            write_file.write(data + "\n")
This produces the new log file:
Wed Feb 21 00:59:32 2018 XXXXXX.x1:00000: message
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message '----action----tansfer' '----failed----'
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message <Error occurred at line 44><html>.............</html>
Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message
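The approach above depends on the word "message" appearing in every entry, which may not hold for a real log. A possible regex-free alternative is to check whether a line starts with a parseable timestamp, sketched below with `datetime.strptime`. The format string, the 24-character prefix length, and the helper names are assumptions drawn from the sample lines, and `%a`/`%b` assume English day and month names in the current locale.

```python
from datetime import datetime

STAMP_FORMAT = "%a %b %d %H:%M:%S %Y"  # assumed from the sample lines
STAMP_LENGTH = 24                      # len("Wed Feb 21 00:59:32 2018")


def starts_record(line: str) -> bool:
    """Return True if the line begins with a parseable timestamp."""
    try:
        datetime.strptime(line[:STAMP_LENGTH], STAMP_FORMAT)
    except ValueError:
        return False
    return True


def merged_records(lines):
    """Yield one string per record, joining continuation lines with spaces."""
    parts = []
    for raw in lines:
        line = raw.rstrip("\n")
        if starts_record(line) and parts:
            yield " ".join(parts)  # previous record is complete
            parts = []
        parts.append(line)
    if parts:
        yield " ".join(parts)      # flush the final record


sample_lines = [
    "Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message\n",
    "'----action----tansfer'\n",
    "'----failed----'\n",
    "Wed Feb 21 00:59:33 2018 XXXXXX.x1:00000: message\n",
]
for record in merged_records(sample_lines):
    print(record)
```

Since `merged_records` is a generator, it can be fed an open file object directly and never holds more than one record in memory.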