I have the following text:
ERROR: <C:\Includes\Library1.inc:123> This is the Error
Call Trace:
<C:\Includes\Library2.inc:456>
<C:\Includes\Library2.inc:789>
<C:\Code\Main.ext:12>
<Line:1>
ERROR: <C:\Includes\Library2.inc:2282> Another Error
Call Trace:
<C:\Code\Main.ext:34>
<C:\Code\Main.ext:56>
<C:\Code\Main.ext:78>
<Line:1>
ERROR: <C:\Code\Main.ext:90> Error Three
I want to extract the following information:
line, Error = 12, This is the Error
line, Error = 34, Another Error
line, Error = 90, Error Three
Here is how far I have got:
import re

theText = 'ERROR: ...'
ERROR_RE = re.compile(r'^ERROR: <(?P<path>.*):(?P<line>[0-9]+)> (?P<error>.*)$')
mainName = r'\Main.ext'  # raw string, so the backslash is taken literally
# Go through each line
for fullline in theText.splitlines():
    match = ERROR_RE.match(fullline)
    if match:
        path, line, error = match.group('path'), match.group('line'), match.group('error')
        if path.endswith(mainName):
            callSomething(line, error)
        # else check next line for 'Call Trace:'
        # check next lines for mainName and get the linenumber
        # callSomething(linenumber, error)
What is the Pythonic way to loop over the remaining elements from inside a loop?
Answer 0 (score: 1)
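The direct answer to your question about how to loop over the remaining lines is: change the first line of the loop to
lines = theText.splitlines()
for linenum, fullline in enumerate(lines):
Then, after a match, you can look at the remaining lines by examining lines[j] in an inner loop, where j starts at linenum+1 and runs until the next match.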
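The inner-loop idea described above might look roughly like the following. This is only a sketch under a few assumptions: `FRAME_RE`, `MAIN_NAME`, and `find_main_errors` are hypothetical names that do not appear in the question, and the results are collected in a list rather than passed to `callSomething`.

```python
import re

ERROR_RE = re.compile(r'^ERROR: <(?P<path>.*):(?P<line>[0-9]+)> (?P<error>.*)$')
# Hypothetical pattern for a call-trace frame such as <C:\Code\Main.ext:12>
FRAME_RE = re.compile(r'^<(?P<path>.*):(?P<line>[0-9]+)>$')
MAIN_NAME = r'\Main.ext'


def find_main_errors(text):
    """Return (line, error) pairs, one per ERROR that can be tied to Main.ext."""
    results = []
    lines = text.splitlines()
    for linenum, fullline in enumerate(lines):
        match = ERROR_RE.match(fullline)
        if not match:
            continue
        error = match.group('error')
        if match.group('path').endswith(MAIN_NAME):
            # The error was raised in Main.ext itself.
            results.append((match.group('line'), error))
            continue
        # Inner loop over the remaining lines, stopping at the next ERROR line.
        for j in range(linenum + 1, len(lines)):
            if ERROR_RE.match(lines[j]):
                break
            frame = FRAME_RE.match(lines[j])
            if frame and frame.group('path').endswith(MAIN_NAME):
                # First Main.ext frame in this error's call trace.
                results.append((frame.group('line'), error))
                break
    return results
```

Note that line numbers come back as strings, exactly as captured by the regular expression; wrap them in `int()` if numeric values are needed.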
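However, a slicker way to solve the problem is to first split the text into blocks. There are many ways to do this but, as a former Perl user, my impulse is to use a regular expression.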
# Split into blocks that start with /^ERROR/ and run until either the next
# /^ERROR/ or until the end of the string.
#
# (?m) - lets '^' and '$' match the beginning/end of each line
# (?s) - lets '.' match newlines
# ^ERROR - triggers the beginning of the match
# .*? - grab characters in a non-greedy way, stopping when the following
#       expression matches
# (?=^ERROR|$(?!\n)) - match until the next /^ERROR/ or the end of string
# $(?!\n) - match end of string. Normally '$' suffices but since we turned
#           on multiline mode with '(?m)' we have to use '$(?!\n)' to prevent
#           this from matching end-of-line.
blocks = re.findall(r'(?ms)^ERROR.*?(?=^ERROR|$(?!\n))', theText)
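Once the text is split into blocks, each block can be handled on its own. The following is a minimal sketch of one way to do that, not necessarily what the answerer had in mind; `FRAME_RE`, `MAIN_NAME`, and `extract_from_blocks` are hypothetical names introduced for illustration.

```python
import re

BLOCK_RE = re.compile(r'(?ms)^ERROR.*?(?=^ERROR|$(?!\n))')
ERROR_RE = re.compile(r'^ERROR: <(?P<path>.*):(?P<line>[0-9]+)> (?P<error>.*)$')
# Hypothetical pattern for a call-trace frame such as <C:\Code\Main.ext:12>
FRAME_RE = re.compile(r'^<(?P<path>.*):(?P<line>[0-9]+)>$')
MAIN_NAME = r'\Main.ext'


def extract_from_blocks(text):
    """Return (line, error) pairs by examining one ERROR block at a time."""
    results = []
    for block in BLOCK_RE.findall(text):
        # The first line of a block is the ERROR line; the rest is its trace.
        header, *trace = block.splitlines()
        match = ERROR_RE.match(header)
        if not match:
            continue  # header does not follow the expected format
        error = match.group('error')
        if match.group('path').endswith(MAIN_NAME):
            results.append((match.group('line'), error))
            continue
        for frame_line in trace:
            frame = FRAME_RE.match(frame_line)
            if frame and frame.group('path').endswith(MAIN_NAME):
                results.append((frame.group('line'), error))
                break  # only the first Main.ext frame is reported
    return results
```

Because every block is self-contained, there is no need to track where one error ends and the next begins while scanning the trace.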
Answer 1 (score: 0)
Replace this:
# else check next line for 'Call Trace:'
# check next lines for mainName and get the linenumber
# callSomething(linenumber, error)
With this:
match = stackframe_re.match(fullline)
if match and error:  # if error is defined from earlier when you matched ERROR_RE
    path, line = match.group('path'), match.group('line')
    if path.endswith(mainName):
        callSomething(line, error)
        error = None  # don't report this error again if you see main again
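Pay attention to the indentation. Initialize error = None before the loop starts, and set error = None after the first call to callSomething. In general, the code I suggest should work for well-formed data, but you may want to improve it so that it does not produce misleading results when the data does not match the format you expect.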
You will have to write stackframe_re yourself, but it should be an RE that matches lines such as
<C:\Includes\Library2.inc:789>
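Putting these pieces together, a single-pass version could be sketched as below. The `stackframe_re` pattern is an assumption based on the example line above, and the `report_errors` wrapper, which takes `callSomething` as a parameter, is a hypothetical addition that keeps the sketch self-contained.

```python
import re

ERROR_RE = re.compile(r'^ERROR: <(?P<path>.*):(?P<line>[0-9]+)> (?P<error>.*)$')
# Assumed pattern for frames such as <C:\Includes\Library2.inc:789>
stackframe_re = re.compile(r'^<(?P<path>.*):(?P<line>[0-9]+)>$')
mainName = r'\Main.ext'


def report_errors(theText, callSomething):
    """Call callSomething(line, error) once for each error traced to Main.ext."""
    error = None  # no error is pending before the loop starts
    for fullline in theText.splitlines():
        match = ERROR_RE.match(fullline)
        if match:
            path, line, error = match.group('path', 'line', 'error')
            if path.endswith(mainName):
                callSomething(line, error)
                error = None  # already reported, nothing pending
            continue
        match = stackframe_re.match(fullline)
        if match and error:  # a frame that belongs to a pending error
            path, line = match.group('path', 'line')
            if path.endswith(mainName):
                callSomething(line, error)
                error = None  # don't report this error again
```

For example, `report_errors(theText, print)` would print one line number and message per error. An error whose trace never reaches Main.ext is silently dropped when the next ERROR line overwrites it, which is one of the places where stricter format checking could be added.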
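When you say "loop over the remaining elements in a loop", I really don't understand what you mean. By default, a loop continues on to the remaining elements.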