Question

I am trying to get all substrings that start with the character 'm' and have 5 characters. I tried this code, but it does not work.

<code>
import re
str1 = "mouseeee mother mouse is beautiful creation"
r = re.compile("m[a-z]{5}$")
print(r.findall(str1))</code>

Answer 1

To extract words that start with a lowercase m followed by 5 more letters, use

import re
str1 = "mouseeee mother mouse is beautiful creation"
r = re.compile(r"\bm[a-z]{5}\b")
print(r.findall(str1)) # => ['mother']

See the Python demo. mouseeee has more than 6 letters, and mouse has only 4 letters after the initial m, so neither of them matches.

Pattern details:

\b - a word boundary
m - the letter m
[a-z]{5} - 5 lowercase ASCII letters
\b - a word boundary.

To make the pattern case-insensitive, pass the re.I flag to re.compile.
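As a rough illustration of the re.I flag, the sketch below runs the same pattern with and without it. The sample sentence is made up for this example and does not come from the question.

```python
import re

# Hypothetical sample text, chosen to contain both "M..." and "m..." words.
text = "Mother and mother met Marvin at the museum"

# Case-sensitive: only six-letter words starting with a lowercase m match.
sensitive = re.compile(r"\bm[a-z]{5}\b")

# re.I (alias of re.IGNORECASE) lets both "m" and "[a-z]" match uppercase letters.
insensitive = re.compile(r"\bm[a-z]{5}\b", re.I)

sensitive_hits = sensitive.findall(text)
insensitive_hits = insensitive.findall(text)

print(sensitive_hits)    # ['mother', 'museum']
print(insensitive_hits)  # ['Mother', 'mother', 'Marvin', 'museum']
```

The flag applies to the whole pattern, so the leading m is relaxed as well as the character class.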

Answer 2

Edit: added the suggestion from WiktorStribiżew

If you want to get all standalone words of length 6 that start with the letter m, you can use:

r = re.compile(r"(?<!\w)(m[a-z]{5})(?!\w)")

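This ensures that the match is neither preceded nor followed by a word character (using a negative lookbehind and a negative lookahead), and that it consists of the letter m followed by 5 other letters. Using \b as a word boundary can replace the negative lookarounds and simplify the pattern, as shown in the other answers.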

>>> import re
>>> str1 = "mouseeee mother mouse is beautiful creation"
>>> r = re.compile(r"(?<!\w)(m[a-z]{5})(?!\w)")
>>> print(r.findall(str1))
['mother']
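To see how the lookaround form compares with \b, here is a small sketch. The sample sentence is invented so that the target words sit at the start of the string, before a comma, and at the end. The third pattern, which demands a literal space on both sides, is included only for contrast.

```python
import re

# Hypothetical sample: matches at the string start, before a comma, and at the end.
text = "mother saw a monkey, then called mother"

boundary = re.compile(r"\bm[a-z]{5}\b")             # word boundaries
lookaround = re.compile(r"(?<!\w)m[a-z]{5}(?!\w)")  # negative lookbehind / lookahead
spaces = re.compile(r"(?<= )m[a-z]{5}(?= )")        # requires a literal space on each side

boundary_hits = boundary.findall(text)
lookaround_hits = lookaround.findall(text)
space_hits = spaces.findall(text)

print(boundary_hits)    # ['mother', 'monkey', 'mother']
print(lookaround_hits)  # ['mother', 'monkey', 'mother']
print(space_hits)       # []
```

Because the pattern itself begins and ends with word characters, \b and the \w-based lookarounds accept the same positions here. The space-based version misses words at the edges of the string or next to punctuation.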

Answer 3

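You probably want the regex \bm[a-z]{5}\b (\b is the word-boundary escape sequence).

Currently, the $ in your regex means the end of the string. Also, nothing prevents a match from starting in the middle of a word.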

>>> import re
>>> str1 = "mouseeee mother mouse is beautiful creation"
>>> r = re.compile(r"\bm[a-z]{5}\b")
>>> r.findall(str1)
['mother']
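The two problems described in this answer can be reproduced with a short sketch. The string "hammering" is a made-up input used only to show a match that starts in the middle of a word.

```python
import re

str1 = "mouseeee mother mouse is beautiful creation"

# The question's pattern: "$" ties the match to the end of the string,
# and str1 ends in "creation", so nothing is found.
anchored = re.findall(r"m[a-z]{5}$", str1)

# Dropping "$" but adding no boundaries: "mousee" is cut out of "mouseeee".
unbounded = re.findall(r"m[a-z]{5}", str1)

# Without a leading \b the match may also begin mid-word (hypothetical input).
mid_word = re.findall(r"m[a-z]{5}", "hammering")
mid_word_bounded = re.findall(r"\bm[a-z]{5}\b", "hammering")

print(anchored)          # []
print(unbounded)         # ['mousee', 'mother']
print(mid_word)          # ['mmerin']
print(mid_word_bounded)  # []
```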

How to limit substring size in Python using a regular expression

3 answers: