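Please do not import Counter. I need to write a function that takes the 3 most frequent words in a string and returns them ordered from most frequent to least frequent.
So h("the the the the cat cat cat in in hat ")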
>>> ["the", "cat", "in"]
If the string contains fewer than 3 distinct words:
h("the the cat")
>>> ["the", "cat"]
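Answer 0 (score: 1)
A frequency hash is first populated with the number of times each word appears in the given string. The top 3 words are then determined from the counts in that hash.
Code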
def h(string):
    return get_top_3(get_frequency_hash(string))

def get_frequency_hash(text):
    # split() with no argument drops the empty strings that split(" ")
    # would produce for leading, trailing, or repeated spaces
    array = text.split()
    frequency = {}
    for word in array:
        try:
            frequency[word] += 1
        except KeyError:
            # first time this word is seen
            frequency[word] = 1
    return frequency

def get_top_3(frequency_hash):
    array_of_tuples = [(k, v) for k, v in frequency_hash.items()]
    # sort by count, highest first
    sorted_array_of_tuples = sorted(array_of_tuples, key=lambda x: -x[1])
    return [k for k, v in sorted_array_of_tuples[0:3]]
Example
h("the the the the cat cat cat in in hat")
# ['the', 'cat', 'in']
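The count-then-rank approach described in this answer can also be written more compactly. The following is only a minimal sketch, assuming words are separated by whitespace and that equally frequent words should keep the order in which they first appear (the question does not say how ties should be handled). The names top_words and n are hypothetical and do not come from the answer.

```python
def top_words(text, n=3):
    """Return up to n words from text, most frequent first."""
    counts = {}
    for word in text.split():  # no-argument split() ignores extra whitespace
        counts[word] = counts.get(word, 0) + 1
    # sorted() is stable, also with reverse=True, so equally frequent words
    # stay in first-seen order (dicts keep insertion order in Python 3.7+).
    ranked = sorted(counts, key=counts.get, reverse=True)
    return ranked[:n]

print(top_words("the the the the cat cat cat in in hat "))  # ['the', 'cat', 'in']
print(top_words("the the cat"))  # ['the', 'cat']
```

Slicing with [:n] is what covers the case of fewer than 3 distinct words: the slice simply returns whatever is available.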
Answer 1 (score: 0)
If we can't import collections.Counter, then let's build it ourselves. It only takes a few lines of code.
import operator

def counter(l):
    # count how many times each item occurs
    result = {}
    for word in l:
        result.setdefault(word, 0)
        result[word] += 1
    return result

def h(s):
    scores = counter(s.split())
    # sort (word, count) pairs by count, ascending
    scores = sorted(scores.items(), key=operator.itemgetter(1))
    # flip to most frequent first
    scores = reversed(scores)
    scores = list(x[0] for x in scores)
    return scores[0:3]

print(h("the the the the cat cat cat in in hat "))
# ['the', 'cat', 'in']
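Both answers sort every distinct word even though only three are needed. If that matters for large inputs, the standard-library heapq.nlargest can select the top entries without a full sort. This is a sketch of a swapped-in technique, not what either answer does; the name h_heap and the parameter n are hypothetical.

```python
import heapq

def h_heap(s, n=3):
    """Return up to n most frequent words in s, most frequent first."""
    counts = {}
    for word in s.split():
        counts[word] = counts.get(word, 0) + 1
    # nlargest iterates over the dict's keys and ranks them by their count,
    # returning them from highest count to lowest.
    return heapq.nlargest(n, counts, key=counts.get)

print(h_heap("the the the the cat cat cat in in hat "))  # ['the', 'cat', 'in']
print(h_heap("the the cat"))  # ['the', 'cat']
```

For short strings like the ones in the question the difference is negligible, so a plain sorted() call remains the simpler choice.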