I have a log output, summarized below. I need to parse the Final Input value, which spans multiple lines, and I can't find a regex that works.
04/10/2019 02:52:59 PM INFO: Model Details:
04/10/2019 02:53:12 PM INFO: Final Input: [ 220.12134 3.7499998 75.00001 111.44428 22.500004
37.5 73.361534 1000.709 ]
04/10/2019 02:53:12 PM INFO: Difference: [ 11.974823 647.91406 ]
04/10/2019 02:53:12 PM INFO: Number: 169
04/10/2019 02:53:12 PM INFO: Time: 13.554227686000004 seconds
I want a numpy array as output:
[220.12134, 3.7499998, 75.00001, 111.44428, 22.500004, 37.5, 73.361534, 1000.709]
With the following code I can get it to work for a single line:
import re

log_file_path = 'some_log.log'
#regex = r'\[(.*?)\]'
regex2 = r'(Final Input: \[)(.*?)(\]|\n)'
# the with block closes the file on exit, so no explicit close() is needed
with open(log_file_path, 'r') as file:
    all_log_file = file.read()
a = re.findall(regex2, all_log_file)
print(a)
#x = list(map(float, a.split()))
I get the following output, which is missing the Final Input values from the next line (I can parse the output below into a numpy array):
[('Final Input: [', ' 220.12134 3.7499998 75.00001 111.44428 22.500004', '\n')]
Answer 0 (score: 1)
Use a non-greedy quantifier together with re.DOTALL, which makes . match \n as well:
import re
# text holds the full log contents (all_log_file in the question)
regex2 = r'(Final Input: \[.+?\])'
a = re.findall(regex2, text, re.DOTALL)
a
Output:
['Final Input: [ 220.12134 3.7499998 75.00001 111.44428 22.500004\n 37.5 73.361534 1000.709 ]']
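To get from that matched string to numbers, one possible follow-up is to capture only the text between the brackets and split it on whitespace. This is a minimal sketch, not part of the original answer: the `text` sample is copied from the log in the question, and the names `match` and `values` are only illustrative.

```python
import re

# Sample copied from the log in the question; in practice this would be
# the string returned by file.read().
text = """04/10/2019 02:52:59 PM INFO: Model Details:
04/10/2019 02:53:12 PM INFO: Final Input: [ 220.12134 3.7499998 75.00001 111.44428 22.500004
37.5 73.361534 1000.709 ]
04/10/2019 02:53:12 PM INFO: Difference: [ 11.974823 647.91406 ]
"""

# Capture only what sits between the brackets; re.DOTALL lets . cross
# the line break, and .+? stops at the first closing bracket.
match = re.search(r'Final Input: \[(.+?)\]', text, re.DOTALL)

# str.split() with no argument splits on any run of whitespace,
# including the embedded newline.
values = [float(tok) for tok in match.group(1).split()] if match else []
print(values)
```

Passing `values` to `numpy.array` should then give the array asked for in the question. If the log contains several Final Input entries, `re.findall` with the same pattern would return one captured string per entry.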