Question

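I am writing a function that returns the 10 most common word lengths in a file named wordlist.txt, which contains words from a to z. I wrote a function (named "value_length") that returns a list with the length of each word in a given list. I also applied Counter from the collections module to a dictionary (with the word lengths as keys and the frequencies of those lengths as values) to solve the problem.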

from collections import Counter

def value_length(seq):
    '''This function takes a sequence and returns a list that contains 
    the length of each element
    '''
    value_l = []
    for i in range(len(seq)):
        length = len(seq[i])
        value_l.append(length)
    print(value_l) 

# open the txt file 
fileobj = open("wordlist.txt", "r")
file_content = []

# create a list with length of every single word   
for line in fileobj:
    file_content.append(line)
    wordlist_lengths = value_length(file_content)

# create a dictionary that has the number of occurrence of each length as key
occurrence = {x:file_content.count(x) for x in file_content}
c = Counter(occurrence)
c.most_common(10)

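However, whenever I run this code, I don't get the result I want; I only get the output of the value_length function (that is, an extremely long list with the length of every word). In other words, Python doesn't seem to process the dictionary. I don't understand what my mistake is.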

Answer 1

There is no need to store the lengths in a list, or to use the list's count method; you have already imported Counter, so just use it to do the counting.

c = Counter()
for word in seq:
    length = len(word)
    c[length] += 1
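
For context, three things in the question's code seem to get in the way: value_length prints its list instead of returning it, the dict comprehension counts the lines themselves rather than their lengths, and the result of c.most_common(10) is never printed. A minimal sketch of the Counter approach applied to a whole file might look like the following. The function name top_word_lengths and the sample list are assumptions made for illustration, and it assumes the file holds one word per line.

```python
from collections import Counter

def top_word_lengths(lines, n=10):
    """Return the n most common word lengths as (length, count) pairs."""
    # strip() drops the trailing newline so it is not counted as a character;
    # blank lines are skipped entirely.
    counts = Counter(len(line.strip()) for line in lines if line.strip())
    return counts.most_common(n)

# Hypothetical sample standing in for the lines of wordlist.txt
sample = ["apple\n", "bat\n", "cat\n", "zebra\n", "dog\n", "a\n"]
print(top_word_lengths(sample))  # prints [(3, 3), (5, 2), (1, 1)]
```

Because the function accepts any iterable of lines, an open file object can be passed directly, for example inside a `with open("wordlist.txt") as f:` block.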

Answer 2

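This code finds the length of each list item and sorts the lengths. You can then simply build a tuple of each length and the number of times it occurs in the list: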

words = ["Hi", "bye", "hello", "what", "no", "crazy", "why", "say", "imaginary"]

lengths = [len(w) for w in words]
print(lengths)
sortedLengths = sorted(lengths)
print(sortedLengths)

countedLengths = [(w, sortedLengths.count(w)) for w in sortedLengths]
print(countedLengths)

This prints:

[2, 3, 5, 4, 2, 5, 3, 3, 9]
[2, 2, 3, 3, 3, 4, 5, 5, 9]
[(2, 2), (2, 2), (3, 3), (3, 3), (3, 3), (4, 1), (5, 2), (5, 2), (9, 1)]
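
Note that the last list repeats a pair once for every word of that length, and it is ordered by length rather than by frequency. If one pair per distinct length, ranked by frequency, is what is wanted, a possible variation (a sketch that swaps in Counter, not part of the code above) is:

```python
from collections import Counter

words = ["Hi", "bye", "hello", "what", "no", "crazy", "why", "say", "imaginary"]
lengths = [len(w) for w in words]

# One (length, count) pair per distinct length, most frequent first.
# Asking for 10 is safe even though only 5 distinct lengths exist here.
ranked = Counter(lengths).most_common(10)
print(ranked)
```

Here the first pair is (3, 3), since three of the words have length 3.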

Top 10 most common word lengths in a word list
