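I am writing a function that returns the 10 most common word lengths in a file named wordlist.txt, which contains words from a to z. I wrote a function (named "value_length") that returns a list with the length of each word in a given list. To solve the problem, I also applied the Counter class to a dictionary (with word lengths as keys and the frequencies of those lengths as values).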
from collections import Counter

def value_length(seq):
    '''This function takes a sequence and returns a list that contains
    the length of each element
    '''
    value_l = []
    for i in range(len(seq)):
        length = len(seq[i])
        value_l.append(length)
    print(value_l)

# open the txt file
fileobj = open("wordlist.txt", "r")
file_content = []
# create a list with length of every single word
for line in fileobj:
    file_content.append(line)
wordlist_lengths = value_length(file_content)
# create a dictionary that has the number of occurrence of each length as key
occurrence = {x:file_content.count(x) for x in file_content}
c = Counter(occurrence)
c.most_common(10)
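However, whenever I run this code, I do not get the result I want; I only get the output of the value_length function (that is, an extremely long list with the length of every word). In other words, Python does not seem to process the dictionary. I do not understand what my mistake is.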
Answer 0 (score: 0)
There is no need to store the lengths in a list, nor to use the list's count method; you have already imported Counter, so just use it to do the counting.
c = Counter()
for word in seq:
    length = len(word)
    c[length] += 1
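To tie this back to the question, here is a minimal sketch of the whole task built on that loop. The function name top_word_lengths and the sample data are hypothetical, chosen only for illustration. Compared with the question's code, it strips the trailing newline that iterating over a file leaves on each line, counts lengths rather than lines, and prints the result of most_common, which is otherwise discarded when the code runs as a script.

```python
from collections import Counter

def top_word_lengths(lines, n=10):
    """Return the n most common word lengths as (length, count) pairs.

    Hypothetical helper: `lines` is any iterable of strings,
    for example an open text file with one word per line.
    """
    counts = Counter()
    for line in lines:
        word = line.strip()  # drop the trailing newline and stray spaces
        if word:             # skip empty lines
            counts[len(word)] += 1
    return counts.most_common(n)

# Made-up sample standing in for the contents of wordlist.txt
sample = ["hi\n", "bye\n", "hello\n", "what\n", "no\n", "why\n", "say\n"]
print(top_word_lengths(sample, 2))  # [(3, 3), (2, 2)]
```

With the real file, something like `with open("wordlist.txt") as f: print(top_word_lengths(f))` should work, assuming the file holds one word per line.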
Answer 1 (score: 0)
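This code finds the length of each list item and sorts the lengths. You can then simply build tuples of (length, number of times that length occurs in the list):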
words = ["Hi", "bye", "hello", "what", "no", "crazy", "why", "say", "imaginary"]
lengths = [len(w) for w in words]
print(lengths)
sortedLengths = sorted(lengths)
print(sortedLengths)
countedLengths = [(w, sortedLengths.count(w)) for w in sortedLengths]
print(countedLengths)
This prints:
[2, 3, 5, 4, 2, 5, 3, 3, 9]
[2, 2, 3, 3, 3, 4, 5, 5, 9]
[(2, 2), (2, 2), (3, 3), (3, 3), (3, 3), (4, 1), (5, 2), (5, 2), (9, 1)]
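Note that the last list repeats each pair once per occurrence of that length. If each length should appear only once, one possible variation (a sketch, not part of the original answer) is to iterate over the distinct lengths instead:

```python
words = ["Hi", "bye", "hello", "what", "no", "crazy", "why", "say", "imaginary"]
lengths = [len(w) for w in words]

# Iterate over each distinct length once, in ascending order
counted_lengths = [(n, lengths.count(n)) for n in sorted(set(lengths))]
print(counted_lengths)  # [(2, 2), (3, 3), (4, 1), (5, 2), (9, 1)]
```

For a large word list, calling count for every distinct length rescans the list each time, so a Counter is likely the better fit there.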