Question

Is it possible to compute the relative frequency of the elements that occur in a list in Python?

For example:

['apple', 'banana', 'apple', 'orange'] # apple for example would be 0.5

Answer 1

You can use NLTK:

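import nltk
text = ['apple', 'banana', 'apple', 'orange']
fd = nltk.FreqDist(text)
# freq() returns the count of a sample divided by the total number of samples
print(fd.freq('apple'))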

See the tutorial in the book for how to use it, and the source code.

Alternatively, you can use a Counter:

from collections import Counter
text = ['apple', 'banana', 'apple', 'orange']
c = Counter(text)
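
The Counter only holds raw counts, so one more step is needed to get relative frequencies. A minimal sketch, assuming Python 3 (where `/` is true division); the names `apple_freq` and `rel_freqs` are chosen here for illustration only:

```python
from collections import Counter

text = ['apple', 'banana', 'apple', 'orange']
c = Counter(text)

# Relative frequency of one element: its count divided by the list length.
apple_freq = c['apple'] / len(text)

# Relative frequency of every element, collected in a dict.
rel_freqs = {word: count / len(text) for word, count in c.items()}

print(apple_freq)
print(rel_freqs)
```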

Answer 2

This simple code does the job. It returns a list of tuples, but you can easily modify it.

It returns the relative frequency of each word, as follows:

lst = ['apple', 'banana', 'apple', 'orange']
counts = [(word, lst.count(word) / len(lst)) for word in set(lst)]

Note:

Iterate over set(lst) to avoid duplicates.
Divide lst.count(word) by len(lst) to get the relative frequency.

Answer 3

You can do this easily just by counting how many times the element appears in the list.

def relative_frequency(lst, element):
    return lst.count(element) / float(len(lst))

words = ['apple', 'banana', 'apple', 'orange']
print(relative_frequency(words, 'apple'))

Answer 4

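Create a dictionary with the words as keys and the number of occurrences as values. Once you have this dictionary, you can divide each value by the length of the word list.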
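
The two steps described here could look roughly like this. It is only a sketch, assuming Python 3 division; the variable names `counts` and `frequencies` are illustrative and not from the answer:

```python
words = ['apple', 'banana', 'apple', 'orange']

# Step 1: build a dictionary with words as keys and occurrence counts as values.
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1

# Step 2: divide each count by the length of the word list.
frequencies = {word: n / len(words) for word, n in counts.items()}

print(frequencies)
```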

Answer 5

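The following snippet does exactly what the question asks: given a Counter() object, it returns a dictionary with the same keys but with the relative frequencies as values. No third-party library is needed.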

def counter_to_relative(counter):
    total_count = sum(counter.values())
    relative = {}
    for key in counter:
        relative[key] = counter[key] / total_count
    return relative
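
A possible way to call it, shown as a self-contained sketch. The function is restated here as a dict comprehension, which should behave the same as the loop above, and Python 3 division is assumed:

```python
from collections import Counter

def counter_to_relative(counter):
    # Same logic as the loop version, written as a dict comprehension.
    total_count = sum(counter.values())
    return {key: count / total_count for key, count in counter.items()}

relative = counter_to_relative(Counter(['apple', 'banana', 'apple', 'orange']))
print(relative)

# The relative frequencies of all elements add up to 1.
print(sum(relative.values()))
```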

Relative frequency in Python