Question

I am using Python's re module, and I am looking for a pattern that matches [some text] but skips [[another text]]. For example, if the input is:

'[aaa]bb[[cd]]'

then the output should be:

bb[[cd]]

I have tried r'(\[){1}(.*?)(\]){1}' and a second pattern, but neither works as intended. Any ideas?

Answer 1

You can use a positive lookbehind, (?<=(\[aaa\])), which matches whatever follows the literal prefix [aaa]:

import re

# Match everything that comes right after the literal prefix "[aaa]".
pattern = r'(?<=(\[aaa\])).+'
text = '[aaa]bb[[cd]]'

match = re.search(pattern, text)
print(match.group())

Output:

bb[[cd]]

Explanation:

\[ matches the character [ literally (case sensitive)

aaa matches the characters aaa literally (case sensitive)

\] matches the character ] literally (case sensitive)
1st Capturing Group (\[aaa\])

(?<=foo) is a lookbehind: it asserts that what immediately precedes the current position in the string is foo

P.S.: If there are multiple matches in the same line, use re.finditer instead of re.search.
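A minimal sketch of the re.finditer suggestion, assuming a made-up input line with two [aaa] prefixes. Note that the greedy .+ from the pattern above runs to the end of the line, so this sketch also tries a narrower \w+ (an assumption, not part of the original answer) to get one match per prefix:

```python
import re

# Hypothetical input with two "[aaa]" prefixes on one line.
text = '[aaa]bb[[cd]] [aaa]ee[[fg]]'

# The greedy .+ consumes the rest of the line, so finditer yields one match.
greedy = [m.group() for m in re.finditer(r'(?<=\[aaa\]).+', text)]

# Assumed variant: \w+ stops at the first non-word character,
# so each prefix produces its own match.
per_prefix = [m.group() for m in re.finditer(r'(?<=\[aaa\])\w+', text)]

print(greedy)
print(per_prefix)
```

Whether the narrower class is appropriate depends on what the text after the prefix can contain.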

Answer 2

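You need two things:

Make sure there are no other square brackets between the brackets: this is easily done with a character class that excludes square brackets, [^\]\[]
Make sure there is no other opening bracket before the opening bracket and no closing bracket after the closing bracket (note that the second test can be optional if you consider the first one sufficient). To do this, use a negative lookbehind (?<!...) (a zero-width assertion that looks backward from the current position and succeeds only if the subpattern does not match there) and a negative lookahead (?!...) (a zero-width assertion that looks forward from the current position and succeeds only if the subpattern does not match there).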

Result:

r'(?<!\[)\[([^\]\[]*)\](?!\])'
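As a quick check, the pattern can be run against the sample input from the question. This is only a sketch; the second input string is an assumption added for illustration:

```python
import re

# Single-bracket matcher from the answer; the group captures the inner text.
single_bracket = re.compile(r'(?<!\[)\[([^\]\[]*)\](?!\])')

# Sample input from the question: only [aaa] is a single-bracket token.
from_question = single_bracket.findall('[aaa]bb[[cd]]')

# Hypothetical input mixing single and double brackets.
mixed = single_bracket.findall('[x][[y]] [z]')

print(from_question)
print(mixed)
```

Because the pattern has exactly one capturing group, findall returns the inner text rather than the full bracketed match.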

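Note that a zero-width assertion means the characters described in the subpattern are not consumed. In other words, if you use this pattern with re.sub, the character before the [ and the character after the ] are not replaced.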
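The difference is easiest to see next to a variant that consumes its neighbours. The sketch below compares the lookaround pattern with a hypothetical alternative that uses negated character classes instead; the `consuming` pattern and the 'x[a]y' input are assumptions made for contrast, not something taken from the answer:

```python
import re

# Pattern from the answer: the lookarounds check neighbours without consuming them.
lookaround = re.compile(r'(?<!\[)\[([^\]\[]*)\](?!\])')

# Hypothetical alternative: negated classes must consume one character on each side.
consuming = re.compile(r'[^\[]\[([^\]\[]*)\][^\]]')

# Removing the single-bracket token from the question's input.
removed = lookaround.sub('', '[aaa]bb[[cd]]')

# The neighbours x and y survive with lookarounds but are swallowed otherwise.
kept_neighbours = lookaround.sub('-', 'x[a]y')
lost_neighbours = consuming.sub('-', 'x[a]y')

# The consuming variant also fails at the start of the string,
# because there is no character before the first "[" to consume.
missed_at_start = consuming.sub('', '[aaa]bb[[cd]]')

print(removed, kept_neighbours, lost_neighbours, missed_at_start)
```

The lookaround version handles tokens at the very start or end of the string, since a negative assertion succeeds when there is no neighbouring character at all.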
