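I have a regular expression with a single group, and I want to use it to map a list of strings to a filtered list of the matched group strings. Currently I am using the following: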
matches = (re.findall(r'wh(at)ever', line) for line in lines)
matches = [m[0] for m in matches if m]
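How can I do this more elegantly using only filter, map, and comprehensions? Obviously I could use a for loop, but I would like to know whether it can be done purely by manipulating iterators.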
Answer 0 (score: 1)
You can use map and filter. Here is one way:
matches = map(lambda x: x[0], filter(None, map(lambda x: re.findall(r'wh(at)ever', x), lines)))
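If you are using Python 3, don't forget to wrap the result in list(...), since map returns a lazy iterator there rather than a list.
However, I don't think this needs to be any more "elegant". What you are already doing is perfectly fine.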
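To illustrate the Python 3 point, here is a minimal sketch of why the list(...) call matters: the map object can only be consumed once. The sample `lines` list is made up for illustration and is not from the question.

```python
import re

# Hypothetical sample input, chosen only for this illustration.
lines = ['whatever', 'nothing here', 'whatever whatever']

# In Python 3 this builds a lazy iterator; no matching has happened yet.
lazy = map(lambda x: x[0],
           filter(None, map(lambda x: re.findall(r'wh(at)ever', x), lines)))

first_pass = list(lazy)   # consumes the iterator and materializes the results
second_pass = list(lazy)  # the iterator is already exhausted at this point

print(first_pass)
print(second_pass)
```

If the result is needed more than once, materializing it with list(...) right away avoids the empty second pass.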
Another way, courtesy of @juanpa.arrivillaga:
import re
from functools import partial
from operator import itemgetter
list(map(itemgetter(0), filter(None, map(partial(re.findall, r'wh(at)ever'), lines))))
Answer 1 (score: 1)
There's no real advantage in obfuscating your code with map, filter, or other functional tricks, since a list comprehension is fast, simple, and clear:
import re
lines = ['wh1atever wh1btever', 'wh2atever', '', 'wh4atever wh4btever wh4ctever']
# Since you only want the first item for each line,
# using re.findall is a waste of time; re.search is more appropriate.
pat1 = re.compile(r'wh(..)tever')
res1 = [ m.group(1) for m in (pat1.search(line) for line in lines) if m ]
print(res1)
# ['1a', '2a', '4a']
# Or, if there are few lines, you can join them and use re.findall this time,
# with a pattern that consumes the rest of the line.
pat2 = re.compile(r'wh(..)tever.*')
res2 = pat2.findall("\n".join(lines))
print(res2)
# ['1a', '2a', '4a']
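For completeness, the same re.search idea can also be written purely with map and filter, which is what the question asked about. This is only a sketch and not part of the original answer: it assumes `operator.methodcaller` is an acceptable way to call `.group(1)` on each match, and it relies on `filter(None, ...)` dropping the `None` that `search` returns for non-matching lines.

```python
import re
from operator import methodcaller

lines = ['wh1atever wh1btever', 'wh2atever', '', 'wh4atever wh4btever wh4ctever']
pat = re.compile(r'wh(..)tever')

# pat.search yields a match object or None for each line;
# filter(None, ...) keeps only the match objects (they are always truthy);
# methodcaller('group', 1) extracts the captured group from each match.
first_groups = list(map(methodcaller('group', 1),
                        filter(None, map(pat.search, lines))))
print(first_groups)
```

Whether this reads better than the list comprehension is a matter of taste; the comprehension above does the same work with less indirection.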