Question

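I'm using Python 3 and working with title strings that carry a bracketed tag containing a pair of names separated by +. Like this: [John+Alice] A title here.

I've been using the regular expression re.search('\[(.+)\]', title) to get the tag [John+Alice], which works fine, but titles with more than one bracketed tag are a problem: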

[John+Alice] [Hayley + Serene] Another title.

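This gives me [John+Alice] [Hayley + Serene], whereas I'd prefer [John+Alice] and [Hayley + Serene].

How can I modify the regex so that it gives me every bracketed tag that has a + between [ and ]? Thanks.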

Answer 1

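You need to make the regex non-greedy, like this:

import re

title = '[John+Alice] [Hayley + Serene] Another title.'

# .+? is non-greedy, so each match stops at the first closing bracket
for t in re.findall(r'\[(.+?)\]', title):
    print(t)

Output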

John+Alice
Hayley + Serene

If you need to include the brackets, use finditer:

# group() returns the whole match, brackets included
for t in re.finditer(r'\[(.+?)\]', title):
    print(t.group())

Output

[John+Alice]
[Hayley + Serene]

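Non-greedy quantifiers such as *?, +?, and ?? match as little text as possible. The documentation for Python's re module covers greedy versus non-greedy matching in more detail.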
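
To make the difference concrete, here is a minimal sketch that runs the greedy and non-greedy forms side by side on the title from the question. The variable names greedy and lazy are only illustrative.

```python
import re

title = '[John+Alice] [Hayley + Serene] Another title.'

# Greedy: .+ grabs as much as possible, so the match runs to the last ']'.
greedy = re.findall(r'\[(.+)\]', title)

# Non-greedy: .+? stops at the first ']' that lets the pattern succeed.
lazy = re.findall(r'\[(.+?)\]', title)

print(greedy)  # ['John+Alice] [Hayley + Serene']
print(lazy)    # ['John+Alice', 'Hayley + Serene']
```

The greedy version returns a single item that spans both tags, which is the behaviour described in the question.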

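Observation

In your question, you are using '\[(.+)\]' to match all bracketed tags that have a + between [ and ], but it actually matches more than that. For example, take the following:

title = '[John+Alice] [Hayley + Serene] [No plus text] Another title.'
print(re.search(r'\[(.+)\]', title).group())

Output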

[John+Alice] [Hayley + Serene] [No plus text]

With my modification (using finditer), the same title gives:

[John+Alice]
[Hayley + Serene]
[No plus text]

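Here [No plus text] should not be in the result, since it contains no +. To fix this, use something like the following:

title = '[John+Alice] [Hayley + Serene] [No plus text] Another title.'

# Require a literal + somewhere between the brackets
for t in re.finditer(r'\[(.+?\+.+?)?\]', title):
    print(t.group())

Output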

[John+Alice]
[Hayley + Serene]
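
One caveat: because . also matches [ and ], the pattern above can run across tag boundaries when a tag without a + comes before one that has a +, and the trailing ? on the group also lets it match an empty []. A possible stricter variant, offered here as a sketch rather than as part of the original answer, uses negated character classes so that a match can never cross a bracket. The name TAG_WITH_PLUS and the reordered sample title are made up for illustration.

```python
import re

# Hypothetical pattern name; not from the original answer.
# [^\[\]+]+  one or more characters that are not a bracket or '+'
# \+         the literal '+' separator
# [^\[\]]+   one or more characters that are not a bracket
TAG_WITH_PLUS = re.compile(r'\[[^\[\]+]+\+[^\[\]]+\]')

# The tag without a '+' now comes first, which trips up the '.'-based pattern.
title = '[No plus text] [John+Alice] [Hayley + Serene] Another title.'

tags = TAG_WITH_PLUS.findall(title)
print(tags)  # ['[John+Alice]', '[Hayley + Serene]']
```

On this reordered title, the pattern from the answer would return [No plus text] [John+Alice] as a single match, while the character-class version skips the tag that has no +.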