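I have the following list:
data_items = ['abc','123data','dataxyz','456','344','666','777','888','888', 'abc', 'xyz']
I have a list of search terms:
search = ['abc','123','xyz','456']
I want to iterate over data_items using the search list to find matches, and build a basic structure that gives a count for each match, e.g.
counts = ['abc':'2', '123':'1', 'xyz':'2'.........]
What is the best way to do this?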
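Answer 0 (score: 4)
You can use re.search and collections.Counter. Sorting the search terms longest-first makes the alternation prefer the longer term whenever one term is a prefix of another. For example: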
import re
from collections import Counter
data_items = ['abc','123data','dataxyz','456','344','666','777','888','888', 'abc', 'xyz']
search = ['abc','123','xyz','456']
to_search = re.compile('|'.join(sorted(search, key=len, reverse=True)))
matches = (to_search.search(el) for el in data_items)
counts = Counter(match.group() for match in matches if match)
# Counter({'abc': 2, 'xyz': 2, '123': 1, '456': 1})
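Note that search() stops at the first hit in each item, so an item containing several search terms is counted only once. If every hit should be counted, a minimal sketch along these lines might work. It assumes the search terms are meant as literal text (hence re.escape) and that non-overlapping occurrences are what you want; the names pattern and hit are only illustrative.

```python
import re
from collections import Counter

data_items = ['abc', '123data', 'dataxyz', '456', '344', '666', '777', '888', '888', 'abc', 'xyz']
search = ['abc', '123', 'xyz', '456']

# Assumption: terms are literal strings, so escape any regex metacharacters.
# Longest-first ordering keeps the alternation from stopping at a shorter prefix.
pattern = re.compile('|'.join(re.escape(term) for term in sorted(search, key=len, reverse=True)))

# findall() returns every non-overlapping match in an item, not just the first one.
counts = Counter(hit for item in data_items for hit in pattern.findall(item))
print(counts)
```

With the sample data the totals are the same as above, since no item contains more than one term. The difference only shows up for items such as 'abc123', which would add one to both 'abc' and '123'.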
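Answer 1 (score: 1)
It looks like you need partial matches too. The code below is straightforward but probably not very efficient, and it assumes you are fine with a dict as the result.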
>>> data_items = ['abc','123data','dataxyz','456','344','666','777','888','888', 'abc', 'xyz']
>>> search = ['abc','123','xyz','456']
>>> result = {k:0 for k in search}
>>> for item in data_items:
...     for search_item in search:
...         if search_item in item:
...             result[search_item]+=1
...
>>> result
{'123': 1, 'abc': 2, 'xyz': 2, '456': 1}
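The nested loops above can also be folded into a single dict comprehension. This is just a sketch of the same substring test; count_partial_matches is a made-up helper name, not something from a library.

```python
def count_partial_matches(data_items, search):
    """Count, for each search term, how many items contain it as a substring."""
    # `term in item` is True/False; sum() adds the True values up as 1s.
    return {term: sum(term in item for item in data_items) for term in search}


data_items = ['abc', '123data', 'dataxyz', '456', '344', '666', '777', '888', '888', 'abc', 'xyz']
search = ['abc', '123', 'xyz', '456']

result = count_partial_matches(data_items, search)
print(result)
```

It still does len(data_items) * len(search) substring checks, so it is no faster than the loop version, only shorter.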
Answer 2 (score: 0)
counts = {}
for s in search:
    lower_s = s.lower()
    # list.count() matches whole elements only, so '123' will not match '123data'
    counts[lower_s] = str(data_items.count(lower_s))
This works if you are comfortable using a dictionary (since you said "structure", it is the better choice).
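If exact, whole-element matching is really what you want, the repeated data_items.count() calls could be replaced by one pass with collections.Counter. The sketch below is one possible variant, not a drop-in equivalent: it assumes the data should be lowercased as well as the search terms, and it keeps the counts as strings to match the structure in the question.

```python
from collections import Counter

data_items = ['abc', '123data', 'dataxyz', '456', '344', '666', '777', '888', '888', 'abc', 'xyz']
search = ['abc', '123', 'xyz', '456']

# One pass over the data; assumption: comparison should be case-insensitive on both sides.
totals = Counter(item.lower() for item in data_items)

# A Counter returns 0 for missing keys, so terms with no exact match still appear.
counts = {s.lower(): str(totals[s.lower()]) for s in search}
print(counts)
```

With the sample data, '123' comes out as '0' and 'xyz' as '1', because '123data' and 'dataxyz' are not whole-element matches.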