Question

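I am looking for a Python regex for a variable phrase with the following properties. (For the sake of example, assume the variable phrase takes the value and. Note, however, that whatever plays the role of and must be passed in as a variable, which I will call phrase.)

Should match: this_and, this.and, (and), [and], and^, ;And, etc.

Should not match: land, andy

This is what I have tried so far (phrase plays the role of and):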

pattern = r"\b" + re.escape(phrase.lower()) + r"\b"

This seems to meet all my requirements, except that it does not match words with underscores, e.g. _hello, hello_, hello_world.
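The underscore problem can be reproduced with a short sketch. It assumes phrase is "and" and uses a handful of made-up test strings; `\b` finds no boundary next to `_` because the underscore counts as a word character (`\w`).

```python
import re

phrase = "and"  # example value, as in the question
pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)

# "_" is matched by \w, so there is no \b between "_" and a letter.
samples = ["this.and", "(and)", "this_and", "and_more", "land"]
results = {s: bool(pattern.search(s)) for s in samples}
print(results)
```

The punctuation cases match, while this_and and and_more do not.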

Edit: Ideally I would like to use the standard library re module rather than any external package.

Answer 1

Here is a regex that may solve the problem:

Regex

(?<=[\W_]+|^)and(?=[\W_]+|$)

Example

import regex  # third-party module (pip install regex), not the stdlib re

string = 'this_And'
test = regex.search(r'(?<=[\W_]+|^)and(?=[\W_]+|$)', string.lower())
print(test.group(0))
# prints 'and'

# No match
string = 'Andy'
test = regex.search(r'(?<=[\W_]+|^)and(?=[\W_]+|$)', string.lower())
print(test)
# prints None

strings = [ "this_and", "this.and", "(and)", "[and]", "and^", ";And"]
pattern = r'(?<=[\W_]+|^)and(?=[\W_]+|$)'
# Search each string once and keep only the successful matches
print([m.group(0) for m in (regex.search(pattern, s.lower()) for s in strings) if m])
# prints ['and', 'and', 'and', 'and', 'and', 'and']

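Explanation

[\W_]+ means that before (?<=) or after (?=) and we accept only non-word characters, plus the underscore _ (which is a word character but is accepted here). |^ and |$ allow the match to sit at the edge of the string.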

Edit

As mentioned in my comment, the regex module does not raise an error for variable-length lookbehinds (as opposed to re).

# This works fine
import regex
word = 'and'
pattern = r'(?<=[\W_]+|^){}(?=[\W_]+|$)'.format(word.lower())
string = 'this_And'
regex.search(pattern, string.lower())

However, if you insist on using re, then off the top of my head I would suggest splitting the lookbehind into two alternatives, (?<=[\W_])and(?=[\W_]+|$)|^and(?=[\W_]+|$), so that strings starting with and are captured as well.

# This also works fine
import re
word = 'and'
pattern = r'(?<=[\W_]){}(?=[\W_]+|$)|^{}(?=[\W_]+|$)'.format(word.lower(), word.lower())
string = 'this_And'
re.search(pattern, string.lower())
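As a rough sketch, the re-only variant can be wrapped in a function and checked against the strings from the question. The name build_pattern is hypothetical, and re.escape is added here as an assumption so that phrases containing regex metacharacters are treated literally.

```python
import re

def build_pattern(phrase):
    """Hypothetical helper: stdlib-only pattern with the lookbehind split in two."""
    word = re.escape(phrase.lower())
    # Branch 1: preceded by a non-word character or underscore.
    # Branch 2: phrase sits at the very start of the string.
    return re.compile(r'(?<=[\W_]){0}(?=[\W_]+|$)|^{0}(?=[\W_]+|$)'.format(word))

rx = build_pattern("and")
should_match = ["this_and", "this.and", "(and)", "[and]", "and^", ";And"]
should_not_match = ["land", "andy"]

matched = [s for s in should_match + should_not_match if rx.search(s.lower())]
print(matched)
```

Only the six strings from the "should match" list end up in matched.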


Answer 2

You can use

r'(?<![^\W_])and(?![^\W_])'

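See the regex demo. Compile with the re.I flag to enable case-insensitive matching.

Details

(?<![^\W_]) - the preceding character must not be a letter or digit
and - some keyword
(?![^\W_]) - the next character must not be a letter or digit

Python demo: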
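The double negation in [^\W_] can be checked directly. The sketch below, using a few sample characters chosen for illustration, shows that the class matches letters and digits but rejects the underscore and punctuation, which is why the negative lookarounds treat _ like any other separator.

```python
import re

# [^\W_] reads as "not (a non-word character or an underscore)",
# which leaves letters and digits only.
inner = re.compile(r'[^\W_]')

classified = {ch: bool(inner.fullmatch(ch)) for ch in ['a', 'Z', '7', '_', '.', ' ']}
print(classified)
```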

import re
strs = ['this_and', 'this.and', '(and)', '[and]', 'and^', ';And', 'land', 'andy']
phrase = "and"
rx = re.compile(r'(?<![^\W_]){}(?![^\W_])'.format(re.escape(phrase)), re.I)
for s in strs:
    print("{}: {}".format(s, bool(rx.search(s))))

Output:

this_and: True
this.and: True
(and): True
[and]: True
and^: True
;And: True
land: False
andy: False
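Because the phrase is passed through re.escape, the same approach should also work for phrases containing regex metacharacters. The following is a minimal sketch; the helper name phrase_regex, the phrase "c++", and the test strings are assumptions made for illustration.

```python
import re

def phrase_regex(phrase):
    """Hypothetical helper wrapping the lookaround pattern for any phrase."""
    return re.compile(r'(?<![^\W_]){}(?![^\W_])'.format(re.escape(phrase)), re.I)

rx = phrase_regex("c++")
samples = ["I like C++.", "use_c++_here", "abc++", "c++11"]
checks = {s: bool(rx.search(s)) for s in samples}
print(checks)
```

The first two strings match; abc++ and c++11 do not, since a letter or digit sits directly next to the phrase.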

Regex to match punctuation at word boundaries, including underscores
