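I'm trying to find the total number of word occurrences across multiple lists inside a list. The real list of lists is large, so I'm only using a dummy instance here: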
from collections import Counter

multiple = [['apple', 'ball', 'cat'], ['apple', 'ball'], ['apple', 'cat'], ...]
words = ['apple', 'ball', 'cat', 'duck', ...]
word = 'apple'
cnt = Counter()
total = 0
for i in multiple:
    for j in i:
        if word in j:
            cnt[word] += 1
            total += cnt[word]
I want output like this:
{'apple':3,'ball':2,'cat':2}
Answer 0: (score: 2)
You can simply feed Counter a generator expression:
cnt = Counter(word for sublist in multiple for word in sublist)
cnt
Out[40]: Counter({'apple': 3, 'ball': 2, 'cat': 2})
sum(cnt.values())
Out[41]: 7
I don't see the point of your words list. You never use it.
If you need to filter out words that are not in words, make words a set, not a list.
words = {'apple','ball','cat','duck'}
cnt = Counter(word for sublist in multiple for word in sublist if word in words)
Otherwise every membership test scans the whole list, and you get O(n ** 2) behavior from what should be an O(n) operation.
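As a minimal sketch of the set-based filtering described above: the helper name count_allowed and the extra word 'emu' are illustrative assumptions, not part of the original question. The list is converted to a set once, so each membership test is O(1) on average.

```python
from collections import Counter


def count_allowed(multiple, allowed):
    """Count words across nested lists, keeping only those in `allowed`.

    `count_allowed` is a hypothetical helper name used for illustration.
    """
    allowed_set = set(allowed)  # build once; average O(1) membership tests
    return Counter(
        word
        for sublist in multiple
        for word in sublist
        if word in allowed_set
    )


# 'emu' is added here only to show that words outside `words` are dropped.
multiple = [['apple', 'ball', 'cat'], ['apple', 'ball'], ['apple', 'cat'], ['emu']]
words = ['apple', 'ball', 'cat', 'duck']

cnt = count_allowed(multiple, words)
total = sum(cnt.values())
print(dict(cnt))  # {'apple': 3, 'ball': 2, 'cat': 2}
print(total)      # 7
```

Note that 'duck' never appears in the data, so it gets no entry in the result, although cnt['duck'] still evaluates to 0 because a Counter returns 0 for missing keys.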
Answer 1: (score: 0)
This works in both Python 2.7 and Python 3.x:
from collections import Counter

multiple = [['apple', 'ball', 'cat'], ['apple', 'ball'], ['apple', 'cat']]
words = ['apple', 'ball', 'cat', 'duck']
cnt = Counter()
total = 0
for i in multiple:
    for word in i:
        # Only count words that appear in the words list
        if word in words:
            cnt[word] += 1
            total += 1
print(cnt)                #: Counter({'apple': 3, 'ball': 2, 'cat': 2})
print(dict(cnt))          #: {'apple': 3, 'ball': 2, 'cat': 2}
print(total)              #: 7
print(sum(cnt.values()))  #: 7
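In Python 2.x, you should use .itervalues() instead of .values(), even though both work, because .itervalues() does not build an intermediate list. Python 3 has only .values().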
Based on roippi's answer, here is a shorter solution:
from collections import Counter
multiple=[['apple','ball','cat'],['apple','ball'],['apple','cat']]
cnt = Counter(word for sublist in multiple for word in sublist)
print(cnt)  #: Counter({'apple': 3, 'ball': 2, 'cat': 2})
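An equivalent way to flatten the nested lists, shown here only as an alternative sketch and not something either answer proposed, is itertools.chain.from_iterable. It yields the same counts as the nested generator expression.

```python
from collections import Counter
from itertools import chain

multiple = [['apple', 'ball', 'cat'], ['apple', 'ball'], ['apple', 'cat']]

# chain.from_iterable lazily yields every word from every sublist in order.
cnt = Counter(chain.from_iterable(multiple))

print(dict(cnt))           # {'apple': 3, 'ball': 2, 'cat': 2}
print(cnt.most_common(1))  # [('apple', 3)]
print(sum(cnt.values()))   # 7
```

Which form to use is mostly a matter of taste; the generator expression is easier to extend with a filter such as "if word in words".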