Question

I am trying to write a regular expression that matches all occurrences of

[[any text or char here]]

in a body of text.

For example:

My name is [[Sean]]
There is a [[new and cool]] thing here.

This all works fine with my regex:

import re

data = "this is my tes string [[ that does some matching ]] then returns."
p = re.compile(r"\[\[(.*)\]\]")
data = p.sub('STAR', data)

The problem arises when there are multiple occurrences to match, such as [[hello]] and [[bye]].

For example:

data = "this is my new string it contains [[hello]] and [[bye]] and nothing else"
p = re.compile(r"\[\[(.*)\]\]")
data = p.sub('STAR', data)

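This matches from the opening brackets of hello all the way to the closing brackets of bye. I want it to replace each of them separately.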

Answer 1

.* is greedy and matches as much text as it can, including ]] and [[, so it matches right across your "tag" boundaries.
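To see the greedy behaviour for yourself, here is a small sketch that runs the pattern from the question through findall. The variable names (greedy, greedy_matches) are just for illustration.

```python
import re

data = "this is my new string it contains [[hello]] and [[bye]] and nothing else"

# Greedy: .* first runs to the end of the string, then backtracks
# only as far as the LAST ]] it can find.
greedy = re.compile(r"\[\[(.*)\]\]")
greedy_matches = greedy.findall(data)

print(greedy_matches)  # ['hello]] and [[bye']
```

There is only one match, and the captured group swallows the inner ]] and [[.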

A quick solution is to make the star lazy by adding a ?:

p = re.compile(r"\[\[(.*?)\]\]")
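As a quick check, this sketch applies the lazy pattern to the string from the question. The names lazy and result are only illustrative.

```python
import re

data = "this is my new string it contains [[hello]] and [[bye]] and nothing else"

# Lazy: .*? grows one character at a time and stops at the FIRST ]]
lazy = re.compile(r"\[\[(.*?)\]\]")

print(lazy.findall(data))  # ['hello', 'bye']

result = lazy.sub("STAR", data)
print(result)  # this is my new string it contains STAR and STAR and nothing else
```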

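A better solution (more robust and more explicit, but slightly slower) is to state outright that we cannot match across tag boundaries: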

p = re.compile(r"\[\[((?:(?!\]\]).)*)\]\]")

Explanation:

\[\[        # Match [[
(           # Match and capture...
 (?:        # ...the following regex:
  (?!\]\])  # (only if we're not at the start of the sequence ]])
  .         # any character
 )*         # Repeat any number of times
)           # End of capturing group
\]\]        # Match ]]
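If you want to keep those comments in real code, one option is the re.VERBOSE flag, which ignores whitespace and # comments inside the pattern. This is just a sketch. The name tag and the second test string are made up for illustration.

```python
import re

tag = re.compile(r"""
    \[\[            # Match [[
    (               # Match and capture...
      (?:           # ...the following group:
        (?!\]\])    # only if we are not at the start of ]]
        .           # any character (except newline, by default)
      )*            # repeated any number of times
    )               # End of capturing group
    \]\]            # Match ]]
""", re.VERBOSE)

data = "this is my new string it contains [[hello]] and [[bye]] and nothing else"
print(tag.findall(data))             # ['hello', 'bye']

# A single ] inside the tag is fine, only the pair ]] ends it
print(tag.findall("[[a]b]] [[c]]"))  # ['a]b', 'c']
```

Note that . does not match newlines unless you also pass re.DOTALL, so tags spanning several lines would need that flag.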

Answer 2

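Use non-greedy matching, .*? : a ? placed after + or * makes it match as few characters as possible. The default is greedy, which consumes as many characters as possible.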

p = re.compile(r"\[\[(.*?)\]\]")

Answer 3

You can use:

p = re.compile(r"\[\[[^\]]+\]\]")

>>> data = "this is my new string it contains [[hello]] and [[bye]] and nothing else"
>>> p = re.compile(r"\[\[[^\]]+\]\]")
>>> data = p.sub('STAR', data)
>>> data
'this is my new string it contains STAR and STAR and nothing else'
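One thing worth knowing before picking this pattern: the negated class [^\]]+ refuses any ] inside the tag and needs at least one character, so it behaves differently from the lazy .*? version on some inputs. A small comparison, using a made-up test string:

```python
import re

negated = re.compile(r"\[\[[^\]]+\]\]")
lazy = re.compile(r"\[\[.*?\]\]")

# Hypothetical input: a tag containing a single ], an empty tag, a normal tag
text = "[[a]b]] and [[]] and [[ok]]"

negated_result = negated.sub("STAR", text)
lazy_result = lazy.sub("STAR", text)

print(negated_result)  # [[a]b]] and [[]] and STAR
print(lazy_result)     # STAR and STAR and STAR
```

If your tags never contain ] and are never empty, the two give the same result, and the negated class avoids backtracking.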

Regex find and replace multiple
