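For an assignment I'm working on, I'm trying to sort the words in a text by how often they appear. I have a method that does almost, but not quite, what I want. Here is my code: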
require 'pry'

def top_words(words)
  word_count = Hash.new(0)
  words = words.split(" ")
  words.each { |word| word_count[word] += 1 }
  word_count = word_count.sort_by do |word, frequencies|
    frequencies
  end
  binding.pry
  word_count.reverse!
  word_count.each { |word, frequencies| puts word + " " + frequencies.to_s }
end

words = "1st RULE: You do not talk about FIGHT CLUB.
2nd RULE: You DO NOT talk about FIGHT CLUB.
3rd RULE: If someone says 'stop' or goes limp, taps out the fight is over.
4th RULE: Only two guys to a fight.
5th RULE: One fight at a time.
6th RULE: No shirts, no shoes.
7th RULE: Fights will go on as long as they have to.
8th RULE: If this is your first night at FIGHT CLUB, you HAVE to fight."
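For some reason, the sort_by call above my binding.pry is changing my Hash into an array of arrays. Why? What I want to do is sort the words within the hash and then take the top three words from it. I haven't figured out how to do that yet, but I'm fairly sure I can once I sort out the array-of-arrays problem.
For now, I suppose I could use .each and something like array[0].each { |stuff| puts stuff[0] + stuff[1] }, but I don't think that is the most efficient way. Any suggestions?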
Answer 0 (score: 1)
"For some reason, the sort_by call above my binding.pry is changing my Hash into an array of arrays. Why?"
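The documentation explains it:
sort_by { |obj| block } → array
The method always returns an array.
The documentation also notes that the current implementation of sort_by generates an array of tuples containing the original collection element and the mapped value, which makes sort_by fairly expensive when the keysets are simple.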
In your case, word_count is a Hash, so sort_by yields each key/value pair and gives you back [[key1, val1], [key2, val2], ...]. That is why you get an array of arrays.
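To make this concrete, here is a minimal sketch with made-up counts. SAMPLE_COUNTS and sorted_pairs are hypothetical names used only for illustration, and Array#to_h assumes Ruby 2.1 or later. It shows that sorting a Hash returns an Array of pairs, and that the pairs can be turned back into a Hash afterwards.

```ruby
# Hypothetical sample data, not taken from the question's text.
SAMPLE_COUNTS = { "fight" => 3, "club" => 2, "rule" => 5 }.freeze

# Hypothetical helper: Hash#sort_by yields each [key, value] pair and
# returns an Array of those pairs, ordered by ascending count.
def sorted_pairs(word_count)
  word_count.sort_by { |_word, frequency| frequency }
end

pairs = sorted_pairs(SAMPLE_COUNTS)
puts pairs.class      # Array
puts pairs.inspect    # [["club", 2], ["fight", 3], ["rule", 5]]

# Array#to_h rebuilds a Hash from the pairs; a Hash keeps insertion order,
# so the sorted order is preserved.
puts pairs.to_h.class # Hash
```

If a Hash is needed after sorting, calling to_h on the result (or Hash[...] on older Rubies) is enough; the sort itself always goes through an Array.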
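"What I want to do is sort the words within the hash and then take the top three words from it. I haven't figured out how to do that yet, but I'm fairly sure I can once I sort out the array-of-arrays problem."
Yes, that is possible: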
sorted_array_of_array = word_count.sort_by { |word, frequencies| frequencies }
top_3_hash = Hash[ sorted_array_of_array.last(3) ]
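As an alternative sketch, the top entries can also be picked without sorting the whole collection first. This assumes Ruby 2.2 or later, where Enumerable#max_by accepts a count and returns the largest elements in descending order. top_n_words is a hypothetical helper name, and the sample sentence is made up so that no two words tie.

```ruby
# Hypothetical helper: returns the n most frequent words as a Hash,
# most frequent first.
def top_n_words(text, n = 3)
  # String#split with no argument splits on runs of whitespace.
  counts = text.split.each_with_object(Hash.new(0)) { |word, hash| hash[word] += 1 }
  # max_by(n) returns the n pairs with the largest counts, largest first.
  counts.max_by(n) { |_word, frequency| frequency }.to_h
end

top_n_words("no no no stop stop limp").each do |word, frequency|
  puts "#{word} #{frequency}"
end
```

When several words share the same count, which of them makes the cut is not something this sketch controls, so add a tie-breaker to the block if that matters.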
I would write the code as follows:
def top_words(words)
  # Split the string on whitespace to build an array of words.
  words = words.split(" ")
  # Build a hash whose keys are words and whose values are the number of
  # times each word occurs in the text.
  word_count = words.each_with_object(Hash.new(0)) { |word, hash| hash[word] += 1 }
  # Sort by frequency; sort_by returns an array of [word, frequency] pairs
  # in ascending order.
  sorted_array_of_array = word_count.sort_by { |_word, frequency| frequency }
  # Take the three most frequent pairs from the end of the sorted array and
  # iterate in reverse, so the most frequent word is printed first.
  sorted_array_of_array.last(3).reverse_each do |word, frequency|
    puts "#{word} has #{frequency}"
  end
end