I'm trying to write a function that extracts only a two-digit integer from text matching a specific regular expression.
import re

def extract_number(message_text):
    regex_expression = 'What are the top ([0-9]{2}) trends on facebook'
    regex = re.compile(regex_expression)
    matches = regex.finditer(message_text)
    for match in matches:
        return match.group()
    # if there were no matches, return None
    return None
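so that when I run
message_text = 'What are the top 54 trends on facebook today'
print(extract_number(message_text))
I would get the number 54. If I write the version below instead, I get whatever characters I put in place of (.+). Why doesn't it work for numbers?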
import re

def extract_number(message_text):
    regex_expression = 'What are the top (.+) trends on facebook'
    regex = re.compile(regex_expression)
    matches = regex.finditer(message_text)
    for match in matches:
        return match.group()

message_text = 'What are the top fifty trends on facebook today'
print(extract_number(message_text))
Answer 0 (score: 1)
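The only problem with both of your snippets is that you return the overall match rather than the capture group you are interested in:
return match.group()
is the same as
return match.group(0)
that is, it returns the overall match, which in your case is the whole matched phrase from "What" through "facebook", not just the number.
By contrast, you need index 1, which refers to what the first capture group matched. The first capture group is the first subexpression enclosed in (...), here ([0-9]{2}):
return match.group(1)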
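To make the difference between the two indices concrete, here is a minimal sketch that runs the pattern from the question against the sample message and prints both results. The variable names `pattern`, `whole`, and `number` are illustrative only and do not come from the original code.

```python
import re

# Illustrative names; the pattern and message are taken from the question.
pattern = re.compile('What are the top ([0-9]{2}) trends on facebook')
match = pattern.search('What are the top 54 trends on facebook today')

whole = match.group()    # same as match.group(0): the overall match
number = match.group(1)  # only what the first capture group matched

print(whole)
print(number)
```

Note that the overall match stops at "facebook": the trailing " today" is not part of it, because the pattern does not cover that text.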
Putting it all together:
import re

def extract_number(message_text):
    regex_expression = 'What are the top ([0-9]{2}) trends on facebook'
    regex = re.compile(regex_expression)
    matches = regex.finditer(message_text)
    # (See bottom of this answer for a loop-less alternative.)
    for match in matches:
        return match.group(1)  # index 1 returns what the 1st capture group matched
    # if there were no matches, return None
    return None

message_text = 'What are the top 54 trends on facebook today'
print(extract_number(message_text))
This yields the desired output:
54
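Note: as @EvanL00 points out, given that only one match is needed, using regex.finditer() followed by a for loop that unconditionally returns on the first iteration is unnecessary and can obscure the intent of the code. A simpler and clearer approach is: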
match = regex.search(message_text)  # Get first match only.
if match:
    return match.group(1)
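For completeness, the loop-less approach can be sketched as a whole function. This is one possible arrangement, not the only one: the module-level name `TOP_TRENDS_RE` is hypothetical, and converting the result with int() is an assumption based on the question asking for an integer (group(1) itself returns a string).

```python
import re

# Hypothetical module-level name; the pattern itself is the one from the question.
TOP_TRENDS_RE = re.compile('What are the top ([0-9]{2}) trends on facebook')

def extract_number(message_text):
    """Return the two-digit number as an int, or None if there is no match."""
    match = TOP_TRENDS_RE.search(message_text)  # first match only
    if match:
        # Assumption: the caller wants an int; drop int() to keep the string.
        return int(match.group(1))
    return None

print(extract_number('What are the top 54 trends on facebook today'))
print(extract_number('What are the top fifty trends on facebook today'))
```

Compiling the pattern once outside the function avoids recompiling it on every call, although Python's re module also caches recently used patterns internally.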
Answer 1 (score: 0)
This works for both digits and words:
import re

def extract_number(message_text):
    regex_expression = 'What are the top ([a-zA-Z0-9]+) trends on facebook'
    regex = re.compile(regex_expression)
    matches = regex.findall(message_text)
    if matches:
        return matches[0]

message_text = 'What are the top fifty trends on facebook today'
print(extract_number(message_text))
message_text = 'What are the top 50 trends on facebook today'
print(extract_number(message_text))
message_text = 'What are the top -- trends on facebook today'
print(extract_number(message_text))
Output:
fifty
50
None
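The reason matches[0] is already the bare word, with no group index needed, is that findall() returns a list of the capture group's strings, rather than whole matches, when the pattern contains exactly one capture group. A small sketch of that behavior, using illustrative variable names (`pattern`, `text`) that are not part of the answer:

```python
import re

# Illustrative names; the pattern is the one used in this answer.
pattern = re.compile('What are the top ([a-zA-Z0-9]+) trends on facebook')
text = 'What are the top fifty trends on facebook today'

print(pattern.findall(text))          # list of what group 1 matched
print(pattern.search(text).group(0))  # overall match, for comparison
```

When nothing matches, findall() returns an empty list, which is why the function falls through the if and implicitly returns None.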