What is the fastest, most elegant way to check whether some element matching a regular expression is in a given list?
For example, given a list:
import re
newlist = ['this', 'thiis', 'thas', 'sada']
regex = re.compile('th.s')
The approach from this question, Regular Expressions: Search in list,
list(filter(regex.match, newlist))
gives me a list:
['this', 'thas']
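However, I only want to get back True or False, so the approach above is inefficient: it iterates over every element of newlist. Is there something like
'this' in newlist
that efficiently and elegantly checks whether some element matching a regular expression is in a given list?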
Answer 0 (score: 1)
As Loocid suggested, you can use any. I would do it with a generator expression like this:
import re
newlist = ['this', 'thiis', 'thas', 'sada']
regex = re.compile('th.s')
# any() stops at the first truthy result, so later words are never tested.
result = any(regex.match(word) for word in newlist)
print(result)  # True
Here is another version using map, which is slightly faster:
result = any(map(regex.match, newlist))
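To see that any really does stop at the first match rather than scanning the whole list, here is a minimal sketch. The counting_match helper and the calls list are hypothetical names introduced only for this illustration; they are not part of the original answer.

```python
import re

regex = re.compile('th.s')
calls = []  # records every word the regex was actually tried on

def counting_match(word):
    # Hypothetical helper: behaves like regex.match, but logs each call.
    calls.append(word)
    return regex.match(word)

newlist = ['this', 'thiis', 'thas', 'sada']
result = any(map(counting_match, newlist))

print(result)  # True
print(calls)   # ['this']
```

Only the first word is tested, because map is lazy in Python 3 and any returns as soon as it sees a truthy value.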
Answer 1 (score: 1)
This evaluates the list only until the first match is found.
import re

def search_for_match(words):
    # Return as soon as one element matches; the rest is never scanned.
    for word in words:
        if re.match(r"th.s", word):
            return True
    return False
Or, to make it more general:
def search_for_match(words, pattern):
    # Same early exit, with the pattern passed in as an argument.
    for word in words:
        if re.match(pattern, word):
            return True
    return False
newlist = ['this','thiis','thas','sada']
found = search_for_match(newlist, r"th.s")
print(found) # True
Just for kicks, I ran these through a timer. I SOOO lost:
import time
t = time.process_time()
newlist = ['this','thiis','thas','sada']
search_for_match(newlist, r"th.s")
elapsed_time1 = time.process_time() - t
print(elapsed_time1) # 0.00015399999999998748
t2 = time.process_time()
newlist = ['this','thiis','thas','sada']
regex = re.compile('th.s')
result = any(regex.match(word) for word in newlist)
elapsed_time2 = time.process_time() - t2
print(elapsed_time2) # 1.1999999999900979e-05
t3 = time.process_time()
newlist = ['this','thiis','thas','sada']
regex = re.compile('th.s')
result = any(map(regex.match, newlist))
elapsed_time3 = time.process_time() - t3
print(elapsed_time3) # 5.999999999950489e-06
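A single time.process_time() measurement of code this short is very noisy, and whichever snippet runs first also pays one-time costs such as the initial pattern compilation inside re.match. A more repeatable comparison can be sketched with timeit. The function names with_loop, with_genexpr, and with_map are made up for this example, and the loop version here uses the precompiled regex so that all three do the same work, which differs from search_for_match above.

```python
import re
import timeit

newlist = ['this', 'thiis', 'thas', 'sada']
regex = re.compile('th.s')

def with_loop():
    # Explicit loop with early return.
    for word in newlist:
        if regex.match(word):
            return True
    return False

def with_genexpr():
    return any(regex.match(word) for word in newlist)

def with_map():
    return any(map(regex.match, newlist))

timings = {}
for func in (with_loop, with_genexpr, with_map):
    # Best of 5 repeats of 50,000 calls each; lower is better.
    timings[func.__name__] = min(timeit.repeat(func, number=50_000, repeat=5))

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s per 50,000 calls")
```

The absolute numbers depend on the machine and Python version, so no expected output is shown. With the match at the very first element, as in this list, the differences mostly reflect call overhead rather than regex work.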
Answer 2 (score: 0)
Another option I can think of (besides using any):
next((x for x in newlist if regex.match(x)), False)
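This does not return True, though: it returns the first matching element, or False if nothing matches. As long as there are no empty strings in the list, the result can still be used in a conditional test :)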
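If a real boolean is needed, one possible variation (an assumption on my part, not something the answer itself proposes) is to use None as the default and compare against it. That also handles the empty-string case:

```python
import re

newlist = ['this', 'thiis', 'thas', 'sada']
regex = re.compile('th.s')

# next() returns the first matching element, or the default (None here).
first = next((x for x in newlist if regex.match(x)), None)
found = first is not None

print(first)  # this
print(found)  # True

# A matching empty string is falsy, but "is not None" still reports it.
empty_ok = re.compile('.*')
first_empty = next((x for x in ['', 'abc'] if empty_ok.match(x)), None)
print(repr(first_empty), first_empty is not None)  # '' True
```

Compared with any, this is mainly useful when the matching element itself is wanted and not just a yes/no answer.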