Question

Possible duplicate:
Python Regular Expression Matching: ## ##

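I already asked this question, but let me restate it better... I'm searching a file line by line for occurrences of ##random_string##. It works except when there are multiple # characters...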

pattern='##(.*?)##'
prog=re.compile(pattern)

string='lala ###hey## there'
result=prog.search(string)

print re.sub(result.group(1), 'FOUND', line)

Desired output:

"lala #FOUND there"

Instead I get the following, because it grabs the whole ###hey##:

"lala FOUND there"

So how do I ignore any number of extra # characters, or capture only "##string##"?

Answer 1

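Your problem is with the inner match. You use ., which matches any character except a line ending, which means it matches # as well. So when it sees ###hey##, it matches (.*?) against #hey.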
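A quick way to see this is to inspect the match groups. This is only a minimal sketch that reuses the pattern and input from the question; the variable names `lazy`, `whole` and `inner` are made up for illustration.

```python
import re

# The lazy pattern from the question: '.' is free to match '#'.
lazy = re.compile(r'##(.*?)##')

match = lazy.search('lala ###hey## there')

# The search begins at the first '#', so the extra '#' lands inside the group.
whole = match.group(0)  # '###hey##'
inner = match.group(1)  # '#hey'
print(whole, inner)
```

Making the quantifier lazy only limits how far the match extends to the right; it does not move the starting point, which is why the leading # is swallowed.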

The simple solution is to exclude the # character from the set of characters that can match:

prog = re.compile(r'##([^#]*)##')

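Protip: use raw strings (e.g. r'') for regular expressions, so you don't have to go crazy with backslash escapes.

Trying to allow # inside the hashes would make things much more complicated.

(Edit: an earlier version did not handle leading/trailing ### correctly.)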
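As a rough sketch of how this pattern behaves on the input from the question: calling `sub` directly on the compiled pattern is an assumption here (the answer only shows the `compile` call), and the names `line`, `captured` and `replaced` are illustrative.

```python
import re

# '[^#]*' can never consume a '#', so the match must start at the
# last two '#' characters before the text.
prog = re.compile(r'##([^#]*)##')

line = 'lala ###hey## there'

captured = prog.search(line).group(1)  # 'hey'
replaced = prog.sub('FOUND', line)     # 'lala #FOUND there'
print(captured)
print(replaced)
```

Substituting with the compiled pattern avoids reusing the matched text as a second regular expression.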

Answer 2

>>> s='lala ###hey## there'
>>> re.sub("(##[^#]+?)#+","FOUND",s)
'lala #FOUND there'

>>> s='lala ###hey## there blah ###### hey there again ##'
>>> re.sub("(##[^#]+?)#+","FOUND",s)
'lala #FOUND there blah ####FOUND'

Answer 3

import re

pattern = r"(##([^#]*)##)"
prog = re.compile(pattern)

line = "lala ###hey## there"
result = prog.search(line)

# group(0) is the whole match ('##hey##'); replace it in the line
print(re.sub(result.group(0), "FOUND", line))

The trick is to say "not #" instead of "anything". This also assumes that

line = "lala #### there"

results in:

line = "lala FOUND there"
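If the empty ####  case should be left alone instead, one possible variation (an assumption, not something this answer proposes) is to require at least one non-# character with + rather than *. The names `strict`, `empty_case` and `normal_case` are hypothetical.

```python
import re

# '+' requires at least one non-'#' character between the markers,
# so a bare '####' no longer counts as a match.
strict = re.compile(r'##[^#]+##')

empty_case = strict.sub('FOUND', 'lala #### there')
normal_case = strict.sub('FOUND', 'lala ###hey## there')

print(empty_case)   # line is left unchanged
print(normal_case)  # 'lala #FOUND there'
```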
