Question

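I am trying to find lines such as "DELETED -- LVHEAP = 258/64806/65937  RSS = 66621". A line should be identified by the marker "-- LVHEAP", and once all such lines have been found I want to output the RSS value, "66621".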

I tried:

text ="DELETED -- LVHEAP = 258/64806/65937  RSS = 66621"

RSS = re.findall("(?<=-- LVHEAP = )\d+\\S+\\S+(?<=RSS =)\d+",text)

The output is empty. Can anyone help?

Answer 1

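I suspect you meant the \S in your regex to match a non-whitespace character. Whether it does depends on how the pattern is written: in a raw string or a regex tester, \\ means "match a literal \", so the S after it is just a literal "S", because the preceding \ has been consumed by the \\. In a non-raw Python string literal like the one in the question, "\\S" reaches the regex engine as \S.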
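The escaping question can be checked directly in Python. The sketch below (variable names are illustrative, not from the original post) compares how a non-raw and a raw string literal spell the same characters, and what each one matches:

```python
import re

text = "DELETED -- LVHEAP = 258/64806/65937  RSS = 66621"

# Non-raw string literal: "\\S+" is three characters (\, S, +),
# so the regex engine sees \S+ (one or more non-whitespace characters).
non_raw = "\\S+"

# Raw string literal: r"\\S+" is four characters (\, \, S, +),
# so the regex engine sees an escaped backslash followed by literal "S"s.
raw = r"\\S+"

print(len(non_raw), len(raw))         # 3 4
print(re.findall(non_raw, text)[:2])  # ['DELETED', '--']
print(re.findall(raw, text))          # []
```

Writing patterns as raw strings (r"...") with single backslashes avoids this ambiguity.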

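Even with that sorted out, the original regex has other problems: \S+ cannot cross the spaces before "RSS", and the lookbehind (?<=RSS =) is tested immediately after the slash-separated numbers, where the preceding text is not "RSS =". Here is a simpler regex that matches your description of what you want to do: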

-- LVHEAP = [\d/]+  RSS = (\d+)

This means:

-- LVHEAP =    a line containing "-- LVHEAP = "
[\d/]+         followed by one or more digits and '/' slashes
  RSS =        followed by "  RSS = "
(\d+)          followed by one or more digits, which are captured

See https://regex101.com/r/LNuF5K/1
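As a minimal sketch of how this pattern could be used from Python: only the first sample line below comes from the question, and the other two are invented for illustration. Because the pattern has exactly one capture group, re.findall returns just the captured digits.

```python
import re

# Sample input; the first line is from the question, the others are
# made up for illustration.
text = """DELETED -- LVHEAP = 258/64806/65937  RSS = 66621
SOMETHING ELSE = 1/2/3  RSS = 11111
DELETED -- LVHEAP = 300/70000/71000  RSS = 72345"""

# Raw string, so each backslash reaches the regex engine unchanged.
# Note the two spaces before "RSS", matching the sample text.
pattern = r"-- LVHEAP = [\d/]+  RSS = (\d+)"

rss_values = re.findall(pattern, text)
print(rss_values)  # ['66621', '72345']
```

The line without "-- LVHEAP" is skipped, so no separate filtering step is needed. If the spacing in the real data varies, replacing the literal spaces with \s+ is one option.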

A looser regex can work too, for example:

-- LVHEAP = [A-Z\d/= ]+ (\d+)

This helps if, for example, "RSS" could be some other all-uppercase label.
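To illustrate the looser pattern, here is a small sketch. The second line and its "VMS" label are invented for this example and stand in for "some other all-caps name".

```python
import re

# Hypothetical lines: only the first comes from the question.
lines = [
    "DELETED -- LVHEAP = 258/64806/65937  RSS = 66621",
    "DELETED -- LVHEAP = 12/3400/5600  VMS = 7890",
]

loose = re.compile(r"-- LVHEAP = [A-Z\d/= ]+ (\d+)")

values = []
for line in lines:
    m = loose.search(line)
    if m:
        # The greedy character class backtracks until a space followed
        # by digits remains, so the group holds the final number.
        values.append(m.group(1))

print(values)  # ['66621', '7890']
```

The trade-off is that this pattern no longer checks which label precedes the number, so it is only appropriate when every "-- LVHEAP" line ends with the value you want.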

Answer 2

Would this work for you?

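import re

# Sample input; replace with your own lines (e.g. read from a file)
lines = ["DELETED -- LVHEAP = 258/64806/65937  RSS = 66621"]

outputs = []
for line in lines:
    if "-- LVHEAP" in line:
        # Capture only the digits after "RSS = " and convert them to int
        matches = [int(m) for m in re.findall(r"RSS = (\d+)", line)]
        outputs.extend(matches)

print(outputs)  # [66621]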

re.findall: ignoring some variable text between two patterns
