Question

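Is it possible to find and replace in a single step using Python's re module, i.e. also get back what was replaced (similar to how re.subn returns the number of substitutions)?

For example, I have text of the form "FOO BAR PART 1", and I want to turn it into "FOO BAR" and "PART 1".

All I can think of is something like this: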

import re

title_old = "FOO BAR PART 1"
parts_found = re.findall(r"PART [0-9]*$", title_old)  # search for the term
if parts_found:
    part_string = parts_found[0]
    # If the term exists, substitute it away.
    title_new = re.sub(re.escape(part_string), "", title_old)

Answer 1

Split on the whitespace that precedes PART.

re.split(r'\s+(?=PART\s*\d*$)', s)

Example:

>>> import re
>>> s = "FOO BAR PART 1"
>>> re.split(r'\s+(?=PART\s*\d*$)', s)
['FOO BAR', 'PART 1']
>>> s = "PART 1"
>>> re.split(r'\s+(?=PART\s*\d*$)', s)
['PART 1']
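Since the split returns a list of either one or two items, a small wrapper can make the "no PART found" case explicit. This is only a sketch: the helper name split_title is made up for illustration, and returning None for a missing part is an assumption about what the caller wants.

```python
import re

def split_title(title):
    """Hypothetical helper: return (title, part), with part=None if absent."""
    # maxsplit=1 guarantees at most two pieces.
    pieces = re.split(r'\s+(?=PART\s*\d*$)', title, maxsplit=1)
    if len(pieces) == 2:
        return pieces[0], pieces[1]
    return pieces[0], None

print(split_title("FOO BAR PART 1"))  # ('FOO BAR', 'PART 1')
print(split_title("FOO BAR"))         # ('FOO BAR', None)
```

Note that a string consisting only of "PART 1" is not split, because there is no whitespace before PART, so it comes back as the title with part=None.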

Answer 2

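You can pass a function instead of a replacement pattern; the match object is passed to that function. Inside it you can use a variable to keep track of all the text that gets replaced.

See the re.sub reference:

If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.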

import re

replacements = []
def repl(m):
    replacements.append(m.group(0))  # Add found match to the list
    return ""                        # Remove the matched text

title_old = "FOO BAR PART 1"
print(re.sub(r"PART [0-9]*$", repl, title_old))
print(replacements)

See the demo.

Result (the first line keeps a trailing space, since only "PART 1" is removed):

FOO BAR 
['PART 1']
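If the module-level list is a concern, the same callback idea can be wrapped so that the list lives inside a function and is returned alongside the new string, a bit like re.subn. This is a minimal sketch; sub_collect is a hypothetical name, not part of the re module.

```python
import re

def sub_collect(pattern, text, replacement=""):
    """Hypothetical helper: like re.sub, but also return the replaced texts."""
    removed = []

    def record(match):
        removed.append(match.group(0))  # remember what was matched
        return replacement              # returned as-is, no backreference expansion

    new_text = re.sub(pattern, record, text)
    return new_text, removed

title_new, removed = sub_collect(r"PART [0-9]*$", "FOO BAR PART 1")
print(repr(title_new))  # 'FOO BAR '
print(removed)          # ['PART 1']
```

Call .strip() on the result, or include the leading whitespace in the pattern, if the trailing space is unwanted.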

Answer 3

Try this:

import re

title_old = "FOO BAR PART 1"
title_new = re.sub(r" PART \d+$", "", title_old)

For details, see the re.sub() documentation.
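The substitution above discards the removed text. If you also need it, one option (a sketch, not the only way) is to use re.search with a capture group and slice the original string at the match position. The function name strip_part is invented for this example.

```python
import re

def strip_part(title):
    """Hypothetical helper: return (title_without_part, part_or_None)."""
    match = re.search(r"\s+(PART \d+)$", title)
    if match is None:
        return title, None
    # Everything before the match is the new title; group 1 is the removed part.
    return title[:match.start()], match.group(1)

print(strip_part("FOO BAR PART 1"))  # ('FOO BAR', 'PART 1')
print(strip_part("FOO BAR"))         # ('FOO BAR', None)
```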

Python: re find and sub in one line
