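I'm iterating with for loops, looking for keyword matches in a list, and compiling the match indices into a third list. I can compile the indices into a list of lists, but I'd like to further group the sublists by the item that was matched.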
import re, itertools
my_list = ['ab','cde']
keywords = ['ab','cd','de']
indices = []
pats = [re.compile(i) for i in keywords]
for pat in pats:
    for i in my_list:
        for m in re.finditer(pat, i):
            a = list((m.start(),m.end()))
            indices.append(a)
print(indices)
Returns:
[[0, 2], [0, 2], [1, 3]]
Trying to get:
[[0, 2], [[0, 2], [1, 3]]]
So that it is clear that:
[[0, 2], [1, 3]]
are the match indices for 'cde' in the example above.
Answer 0 (score: 2)
Make indices a dictionary:
import re, itertools
my_list = ['ab','cde']
keywords = ['ab','cd','de']
indices = {}  # maps each string in my_list to its list of [start, end] matches
pats = [re.compile(i) for i in keywords]
for pat in pats:
    for i in my_list:
        indices.setdefault(i, [])  # make sure every string has an entry
        for m in re.finditer(pat, i):
            a = list((m.start(),m.end()))
            indices[i].append(a)
print(indices)
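which gives:
{'cde': [[0, 2], [1, 3]], 'ab': [[0, 2]]}
(The key order can differ: on Python 3.7+, where dicts keep insertion order, 'ab' is printed first.)
Is this what you're looking for?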
I've been playing with this code for a while, and since you import itertools anyway, you can also use it to get rid of those ugly nested fors ;) Like this:
import re
from itertools import product
my_list = ['ab', 'cde']
keywords = ['ab', 'cd', 'de']
indices = {}
pats = [re.compile(i) for i in keywords]
# product() yields every (string, pattern) pair, replacing the two outer loops
for i, pat in product(my_list, pats):
    indices.setdefault(i, [])
    for m in re.finditer(pat, i):
        indices[i].append((m.start(), m.end()))
print(indices)
Unfortunately, I couldn't get Bakuriu's idea of using a list comprehension to work properly, so for now this seems like the best solution to me.
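For what it's worth, a dict comprehension is one possible way to write the same mapping without explicit loops. This is only a sketch and not necessarily what Bakuriu had in mind; the loop variables `s` and `k` are names chosen here for illustration, and the spans are stored as tuples, as in the `product` version.

```python
import re

my_list = ['ab', 'cde']
keywords = ['ab', 'cd', 'de']
pats = [re.compile(k) for k in keywords]

# One entry per string; the inner comprehension walks every pattern
# and collects the (start, end) span of each match found in that string.
indices = {
    s: [(m.start(), m.end()) for pat in pats for m in pat.finditer(s)]
    for s in my_list
}
print(indices)  # {'ab': [(0, 2)], 'cde': [(0, 2), (1, 3)]} on Python 3.7+
```

Strings without any match still get an entry with an empty list, which matches the `setdefault` behaviour of the loop versions.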
Answer 1 (score: 0)
Create a list for each pattern/string pair, accumulate the matches in that list, and finally append it to the result:
import re, itertools
my_list = ['ab','cde']
keywords = ['ab','cd','de']
indices = []
pats = [re.compile(i) for i in keywords]
for pat in pats:
    for i in my_list:
        sublist = []  # matches of this pattern in this string
        for m in re.finditer(pat, i):
            a = list((m.start(),m.end()))
            sublist.append(a)
        indices.append(sublist)  # appended even when it is empty
print(indices)
Or you can use a list comprehension:
import re, itertools
my_list = ['ab','cde']
keywords = ['ab','cd','de']
indices = []
pats = [re.compile(i) for i in keywords]
for pat in pats:
    for i in my_list:
        sublist = [(m.start(), m.end()) for m in re.finditer(pat, i)]
        indices.append(sublist)
print(indices)
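Note that, assuming the sublist is appended once per pattern/string pair, the loops above also collect empty sublists and still do not group the matches by string. A minimal sketch of one way to get one group per string, as the question asks, is to put the strings in the outer position. The names `per_pair`, `grouped`, `s` and `k` are illustrative and not part of the original answer.

```python
import re

my_list = ['ab', 'cde']
keywords = ['ab', 'cd', 'de']
pats = [re.compile(k) for k in keywords]

# Pattern-outer order: one sublist per (pattern, string) pair, empties included.
per_pair = [[(m.start(), m.end()) for m in pat.finditer(s)]
            for pat in pats for s in my_list]
print(per_pair)  # [[(0, 2)], [], [], [(0, 2)], [], [(1, 3)]]

# String-outer order: one sublist per string, holding the matches of all patterns.
grouped = [[(m.start(), m.end()) for pat in pats for m in pat.finditer(s)]
           for s in my_list]
print(grouped)  # [[(0, 2)], [(0, 2), (1, 3)]]
```

Unlike the output requested in the question, the single match for 'ab' is also wrapped in its own sublist here, which keeps the nesting uniform.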