Question

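I am trying to match some strings using a regular expression. I want to find any string that talks about someone's children, for example: my son, daughter, daughters, and so on.

So I wrote this in Python: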

re.match(r'\b(my|our)\b \b(son|daughter|children|child|kid)s?', 'me and my son were')

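But for some reason it does not match my son in the test sentence. It returns None.

I tested this regex here: https://regex101.com/r/ChAy9e/1 and it works fine (line 5 of the test cases).

I can't figure out what I am doing wrong.

Thanks!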

Answer 1

match only matches the regex at the beginning of the string; you need to use the findall method:

>>> re.findall(r'\b(my|our)\b \b(son|daughter|children|child|kid)s?', 'me and my son were')
[('my', 'son')]

match tries to apply the pattern at the start of the string, then returns a match object, or None if no match was found.
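
To make the difference concrete, here is a minimal sketch that runs the question's pattern and sentence through re.match, re.search, and re.findall. The variable names are only for illustration; re.search is not mentioned in the answer, but it is the standard way to get the first match anywhere in a string.

```python
import re

pattern = r'\b(my|our)\b \b(son|daughter|children|child|kid)s?'
text = 'me and my son were'

# re.match only tries position 0, where the text begins with "me", so it fails.
at_start = re.match(pattern, text)

# re.search scans the whole string and returns the first match object.
first = re.search(pattern, text)

# re.findall returns every match; with two capture groups each item is a tuple.
all_groups = re.findall(pattern, text)

print(at_start)        # None
print(first.group(0))  # my son
print(all_groups)      # [('my', 'son')]
```

If only the first occurrence matters, re.search is probably enough; findall is the better fit when a sentence may mention several children.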

Answer 2

As the other answer says, you need re.findall. However, if you want each phrase returned as a single element, you will need to modify your regex slightly. Try:

In [1]: re.findall(r'\b(?:my|our)\s+(?:son|daughter|kid)s?|children|child\b', 'me and my son were')
Out[1]: ['my son']

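Remove the capture groups so that each phrase is returned as a single match. I also tidied up your regex, since you don't need to look for childrens and childs (which is incorrect grammar!).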
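
As a rough illustration of why the groups matter, the sketch below compares what findall returns with capturing and non-capturing groups. The sample sentence and the shortened patterns are made up for this example and are not from the question.

```python
import re

text = 'my son and our daughters'

# Capturing groups: findall returns one tuple of group contents per match.
with_groups = re.findall(r'\b(my|our)\s+(son|daughter|kid)s?', text)

# Non-capturing groups (?:...): findall returns the whole matched text.
without_groups = re.findall(r'\b(?:my|our)\s+(?:son|daughter|kid)s?', text)

print(with_groups)     # [('my', 'son'), ('our', 'daughter')]
print(without_groups)  # ['my son', 'our daughters']
```

Note that the optional s sits outside the second group, so the capturing version reports 'daughter' even though the text says 'daughters'.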

Details

\b          # word boundary
(?:         # open non-capture group
    my          
    |           # 'or' operation
    our         
) 
\s+         # whitespace - one or more
(?:         # open non-capture group
    son        
    |
    daughter
    |
    kid
)
s?          # 's' optional           
|
children
|
child
\b          # word boundary
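
The breakdown above can be run more or less as written using re.VERBOSE, which ignores whitespace and # comments inside the pattern. One thing worth checking when you do so: the last two | operators sit at the top level, so children and child are alternatives to the whole "my/our ..." branch and can match without a possessive in front. The grouped variant below is a hypothetical adjustment, not part of the answer, that wraps all the nouns in one extra group (and also applies the trailing \b to every noun).

```python
import re

# The pattern from the breakdown, compiled as-is in verbose mode.
as_written = re.compile(r'''
    \b (?: my | our ) \s+ (?: son | daughter | kid ) s?
    | children
    | child \b
''', re.VERBOSE)

# Hypothetical variant: one extra group keeps "my"/"our" required for every noun.
grouped = re.compile(r'''
    \b (?: my | our ) \s+
    (?: (?: son | daughter | kid ) s? | children | child )
    \b
''', re.VERBOSE)

print(as_written.findall('me and my son were'))    # ['my son']
print(as_written.findall('my children are here'))  # ['children']
print(grouped.findall('my children are here'))     # ['my children']
print(grouped.findall('children are here'))        # []
```

Which behaviour is wanted depends on the use case: the pattern as written also picks up a bare "children", while the grouped variant only reports phrases that start with my or our.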

