Nested data structures: Why is my hash turning into an array of arrays?

Asked: 2014-03-15 23:24:23

Tags: ruby arrays hash

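For an assignment I'm working on, I'm trying to sort the words in a text by how frequently they occur. I have a function that almost does what I want, but not quite. Here is my code: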

require 'pry'
def top_words(words)
  word_count = Hash.new(0)
  words = words.split(" ")
  words.each { |word| word_count[word] += 1 }
  word_count = word_count.sort_by do |words, frequencies|
    frequencies
  end
  binding.pry
  word_count.reverse!
  word_count.each { |word, frequencies| puts word + " " + frequencies.to_s }
end

words = "1st RULE: You do not talk about FIGHT CLUB.
2nd RULE: You DO NOT talk about FIGHT CLUB.
3rd RULE: If someone says 'stop' or goes limp, taps out the fight is over.
4th RULE: Only two guys to a fight.
5th RULE: One fight at a time.
6th RULE: No shirts, no shoes.
7th RULE: Fights will go on as long as they have to.
8th RULE: If this is your first night at FIGHT CLUB, you HAVE to fight."

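For some reason, the sort_by method above my binding.pry is changing my Hash structure into an array of arrays. Why? What I want to do is sort the words in the hash and then grab the top three words from it. I haven't figured out how to do that yet, but I'm pretty sure I can once I sort out the array-of-arrays problem.

Now, I figure I could use .each and array[0].each { |stuff| puts stuff[0] + stuff[1] }, but I don't think that's the most efficient way. Any suggestions?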

1 Answer:

Answer 0: (score: 1)

  

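For some reason, the sort_by method above my binding.pry is changing my Hash structure into an array of arrays. Why?

Here is the explanation:

The sort_by { |obj| block } → array method always returns an array.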

  

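The current implementation of sort_by generates an array of tuples containing the original collection element and the mapped value. This makes sort_by fairly expensive when the keysets are simple.

Now, in your case, word_count is a Hash object, so sort_by yields each key/value pair to the block and gives you back [[key1, val1], [key2, val2], ...]. That is why you get an array of arrays.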
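To make the behaviour concrete, here is a minimal sketch. The sample hash is made up for illustration and is not taken from the question; it only shows that sort_by on a Hash returns an Array of [key, value] pairs, and that Hash[] can turn those pairs back into a Hash.

```ruby
# Hypothetical sample data, not from the question.
word_count = { "fight" => 4, "club" => 3, "rule" => 8 }

# sort_by yields each key/value pair to the block and collects the pairs
# into a new Array, ordered by the block's return value (ascending).
sorted = word_count.sort_by { |_word, frequency| frequency }

p sorted.class  # prints Array
p sorted        # prints [["club", 3], ["fight", 4], ["rule", 8]]

# Hash[] rebuilds a Hash from the pairs; since Ruby 1.9 a Hash keeps
# insertion order, so the sorted order is preserved.
back_to_hash = Hash[sorted]
p back_to_hash.keys  # prints ["club", "fight", "rule"]
```

On Ruby 2.1 and later, sorted.to_h should give the same result as Hash[sorted].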

  

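What I want to do is sort the words in the hash and then grab the top three words from it. I haven't figured out how to do that yet, but I'm pretty sure I can once I sort out the array-of-arrays problem.

Yes, that is possible.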

sorted_array_of_array = word_count.sort_by { |word, frequency| frequency }
top_3_hash = Hash[ sorted_array_of_array.last(3) ]

I would write the code as follows:

def top_words(words)
  # Split the string on whitespace to build an array of words.
  words = words.split(" ")
  # Build a hash whose keys are the words and whose values are the number of
  # times each word occurs in the text.
  word_count = words.each_with_object(Hash.new(0)) { |word, hash| hash[word] += 1 }
  # Sort the hash by frequency; this yields an array of [word, frequency]
  # pairs in ascending order.
  sorted_array_of_array = word_count.sort_by { |word, frequency| frequency }
  # Take the last three pairs (the three most frequent words) and iterate in
  # reverse so the output runs from the most frequent word downward.
  sorted_array_of_array.last(3).reverse_each do |word, frequency|
    puts "#{word} has #{frequency}"
  end
end
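
If you would rather get the top words back as a Hash instead of printing them, a variant along these lines might work. This is only a sketch: the method name top_n_words and the parameter n are made up for illustration, and because sort_by is not guaranteed to be stable, words with equal counts may come out in any order.

```ruby
# Hypothetical helper: returns the n most frequent words as a Hash,
# ordered from most to least frequent.
def top_n_words(text, n = 3)
  # Count occurrences of each whitespace-separated word.
  counts = text.split.each_with_object(Hash.new(0)) { |word, hash| hash[word] += 1 }
  # Sort ascending by frequency, keep the last n pairs, then reverse them
  # so the most frequent word comes first.
  pairs = counts.sort_by { |_word, frequency| frequency }.last(n).reverse
  # Convert the array of [word, frequency] pairs back into a Hash.
  Hash[pairs]
end

top = top_n_words("fight club fight rule fight club", 2)
top.each { |word, frequency| puts "#{word} #{frequency}" }
# prints:
# fight 3
# club 2
```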