Script:
import re

matches = ['hello', 'hey', 'hi', 'hiya']

def check_match(string):
    for item in matches:
        if re.search(item, string):
            print('Match found: ' + string)
        else:
            print('Match not found: ' + string)

check_match('hey')
check_match('hello there')
check_match('this should not match')
check_match('oh, hiya')
Output:
Match not found: hey
Match found: hey
Match not found: hey
Match not found: hey
Match found: hello there
Match not found: hello there
Match not found: hello there
Match not found: hello there
Match not found: this should not match
Match not found: this should not match
Match found: this should not match
Match not found: this should not match
Match not found: oh, hiya
Match not found: oh, hiya
Match found: oh, hiya
Match found: oh, hiya
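There are several things I don't understand. For starters, each string is searched four times in this output, and some strings report more than one found match. I'm not sure what in my code causes this. Could someone take a look and see what's wrong?
The expected output is: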
Match found: hey
Match found: hello there
Match not found: this should not match
Match found: oh, hiya
Answer 0 (score: 5)
It isn't behaving incorrectly; the problem is your misunderstanding of re.search(...).
See the comments after each line of output:
Match not found: hey # because 'hello' is not in 'hey'
Match found: hey # because 'hey' is in 'hey'
Match not found: hey # because 'hi' is not in 'hey'
Match not found: hey # because 'hiya' is not in 'hey'
Match found: hello there # because 'hello' is in 'hello there'
Match not found: hello there # because 'hey' is not in 'hello there'
Match not found: hello there # because 'hi' is not in 'hello there'
Match not found: hello there # because 'hiya' is not in 'hello there'
Match not found: this should not match # because 'hello' is not in 'this should not match'
Match not found: this should not match # because 'hey' is not in 'this should not match'
Match found: this should not match # because 'hi' is in 'this should not match'
Match not found: this should not match # because 'hiya' is not in 'this should not match'
Match not found: oh, hiya # because 'hello' is not in 'oh, hiya'
Match not found: oh, hiya # because 'hey' is not in 'oh, hiya'
Match found: oh, hiya # because 'hi' is in 'oh, hiya'
Match found: oh, hiya # because 'hiya' is in 'oh, hiya'
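If you don't want the pattern hi to match the input oh, hiya, you should wrap the pattern in word boundaries:
\bhi\b
This makes it match only occurrences of hi that are not directly adjacent to other word characters (letters, digits, or underscores). For example, well hiya there does not match the pattern \bhi\b, but a string in which hi appears as a standalone word does.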
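To make the word-boundary behaviour concrete, here is a minimal sketch in Python 3. The helper name contains_word and the input 'oh, hi there' are made up for illustration; re.escape is an extra precaution that the answer above does not use.

```python
import re

def contains_word(word, text):
    """Return True if `word` occurs in `text` as a whole word (hypothetical helper)."""
    # \b matches the position between a word character and a non-word character
    # (or the start/end of the string). re.escape keeps `word` literal.
    pattern = r'\b' + re.escape(word) + r'\b'
    return re.search(pattern, text) is not None

print(contains_word('hi', 'oh, hiya'))               # False: 'hi' is followed by 'y'
print(contains_word('hiya', 'oh, hiya'))             # True
print(contains_word('hi', 'this should not match'))  # False: 'hi' sits inside 'this'
print(contains_word('hi', 'oh, hi there'))           # True: 'hi' stands alone
```

Without the \b anchors, the first and third calls would report a match, which is exactly the surprise in the question's output.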
Answer 1 (score: 2)
Try this. It's more concise, and it reports multiple matches:
import re

matches = ['hello', 'hey', 'hi', 'hiya']

def check_match(string):
    # Keep every word from matches that occurs in string as a whole word.
    results = [item for item in matches if re.search(r'\b%s\b' % item, string)]
    print('Found %s' % results if results else 'No match found')

check_match('hey')
check_match('hello there')
check_match('this should not match')
check_match('oh, hiya')
check_match('xxxxx xxx')
check_match('hello and hey')
This gives:
Found ['hey']
Found ['hello']
No match found
Found ['hiya']
No match found
Found ['hello', 'hey']
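A possible variation, sketched here for Python 3 rather than taken from the answer: the word list can be joined into one precompiled alternation so each string is scanned once. The names WORDS, PATTERN, and find_words are hypothetical, and re.escape is added on the assumption that the words might someday contain regex metacharacters.

```python
import re

WORDS = ['hello', 'hey', 'hi', 'hiya']

# Builds one pattern equivalent to \b(?:hello|hey|hi|hiya)\b
PATTERN = re.compile(r'\b(?:' + '|'.join(re.escape(w) for w in WORDS) + r')\b')

def find_words(text):
    """Return every whole-word hit in `text`, in the order it appears."""
    return PATTERN.findall(text)

print(find_words('hello and hey'))          # ['hello', 'hey']
print(find_words('oh, hiya'))               # ['hiya']
print(find_words('this should not match'))  # []
```

One difference from the list comprehension: findall reports hits in the order they occur in the text and repeats a word that appears more than once, whereas the comprehension reports each word at most once, in the order of the word list.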
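Answer 2 (score: 0)
You get 4 searches and 4 outputs because you loop over an array, searching and printing something for each element of the array...
Answer 3 (score: 0)
The for loop checks the string against every entry in matches and prints "found" or "not found" for each one. What you really want is to see whether any of the entries match, and then print a single "found" or "not found". I don't actually know Python, so the syntax may be off.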
found = False  # must be initialised before the loop
for item in matches:
    if re.search(item, string):
        found = True
        break  # one hit is enough
if found:
    print('Match found: ' + string)
else:
    print('Match not found: ' + string)
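Putting the two ideas from the answers together (one result per string, plus word boundaries), a complete function might look like the sketch below. It assumes Python 3, uses any() in place of the explicit flag, and returns the line instead of printing it so it is easier to check. MATCHES and the words parameter are illustrative names.

```python
import re

MATCHES = ['hello', 'hey', 'hi', 'hiya']

def check_match(string, words=MATCHES):
    """Return a single result line for `string`, however many words match."""
    # any() stops at the first word that is found as a whole word.
    found = any(re.search(r'\b' + re.escape(w) + r'\b', string) for w in words)
    return ('Match found: ' if found else 'Match not found: ') + string

for s in ['hey', 'hello there', 'this should not match', 'oh, hiya']:
    print(check_match(s))
```

With these four inputs the loop prints the expected output listed in the question.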