I have the following list in Python:
texts = [
["great", "even", "for", "the", "non", "runner", "this", "sound",
"track", "was", "brilliant"],
["cannot", "recommend", "as", "a", "former", "comrade", "i", "did",
"not", "want", "to", "have", "to", "do", "this"]
]
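I want to go through the list and count how often each word appears in it.
I tried using length() to count the individual words, but the result was 2, which means it does not work.
Is there a way to count how often a word appears in the list? I intend to store the counted words in one new list and their frequencies in another.
Thanks in advance.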
Answer 0 (score: 1)
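The first thing to note is that texts is a nested list, which is why len(texts) returns 2: texts contains 2 sublists.
If you want to iterate over the individual words, you need to iterate over the sublists and then over the words in each sublist. Fortunately, Python's list comprehensions can be nested:
[word for words in texts for word in words]
As for the counting: the standard library has a dictionary subclass for exactly this purpose, collections.Counter:
import collections
word_counts = collections.Counter(word for words in texts for word in words)
This gives you a dictionary that maps each word to its number of occurrences.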
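Since the question asks for the words in one list and their frequencies in another, here is a minimal sketch of how the Counter could be split that way. The names words and frequencies are illustrative assumptions and do not come from the original answer.

```python
from collections import Counter

texts = [
    ["great", "even", "for", "the", "non", "runner", "this", "sound",
     "track", "was", "brilliant"],
    ["cannot", "recommend", "as", "a", "former", "comrade", "i", "did",
     "not", "want", "to", "have", "to", "do", "this"],
]

# Flatten the nested list and count every word in one pass.
word_counts = Counter(word for words in texts for word in words)

# Two parallel lists (hypothetical names): words[i] occurs frequencies[i] times.
words = list(word_counts.keys())
frequencies = list(word_counts.values())

print(word_counts["this"])  # 2, "this" appears once in each sublist
print(word_counts["to"])    # 2, "to" appears twice in the second sublist
```

Because a Counter keeps its keys in insertion order, the two lists stay aligned by index.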
Answer 1 (score: 0)
You can use Counter for this.
texts = [
["great", "even", "for", "the", "non", "runner", "this", "sound",
"track", "was", "brilliant"],
["cannot", "recommend", "as", "a", "former", "comrade", "i", "did",
"not", "want", "to", "have", "to", "do", "this"]
]
from collections import Counter

# Count the words of each sublist separately
for text in texts:
    cnt = Counter()
    for word in text:
        cnt[word] += 1
    print(cnt)
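The loop above counts each sublist on its own. If one count across all sublists is wanted, as the question seems to ask, a single Counter could be updated with every sublist. This is only a sketch: the name total and the small stand-in data are assumptions for illustration.

```python
from collections import Counter

# Small stand-in data; any list of word lists would work the same way.
texts = [["a", "b"], ["a", "c"]]

total = Counter()  # hypothetical name for the running total
for text in texts:
    # update() adds the counts of this sublist to the existing counts.
    total.update(text)

print(total["a"])  # 2
```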
Answer 2 (score: 0)
A one-liner:
from collections import Counter
from itertools import chain
texts = [["a", "b"], ["a", "c"]]
words_count = Counter(chain(*texts))
print(words_count)
>> Counter({'a': 2, 'b': 1, 'c': 1})
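As a possible variation, chain.from_iterable flattens the nested list without unpacking it into separate arguments, and most_common returns the words sorted by frequency. This sketch reuses the same toy data.

```python
from collections import Counter
from itertools import chain

texts = [["a", "b"], ["a", "c"]]

# from_iterable takes the outer list directly instead of *texts.
words_count = Counter(chain.from_iterable(texts))

# most_common(n) returns the n most frequent (word, count) pairs.
print(words_count.most_common(1))  # [('a', 2)]
```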
Answer 3 (score: -1)
You can use Counter to count the words:
from collections import Counter
texts = [["great", "even", "for", "the", "non", "runner", "this", "sound","track", "was", "brilliant"],
["cannot", "recommend", "as", "a", "former", "comrade", "i", "did", "not", "want", "to", "have", "to", "do", "this"]]
for text in texts:
    print(Counter(text))
# Counter({'great': 1, 'even': 1, 'for': 1, 'the': 1, 'non': 1, 'runner': 1, 'this': 1, 'sound': 1, 'track': 1, 'was': 1, 'brilliant': 1})
# Counter({'to': 2, 'cannot': 1, 'recommend': 1, 'as': 1, 'a': 1, 'former': 1, 'comrade': 1, 'i': 1, 'did': 1, 'not': 1, 'want': 1, 'have': 1, 'do': 1, 'this': 1})