Find the N keys with the highest values in a hash, keeping their order

Time: 2012-02-27 02:48:13

Tags: ruby sorting hash

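In a Ruby script:

  • I have a hash whose keys are sentences and whose values are relevance scores.
  • I want to retrieve an array containing the N most relevant (highest-scoring) sentences.
  • I want to preserve the order in which these sentences were extracted.

Suppose: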

hash = {
  'This is the first sentence.' => 5,
  'This is the second sentence.' => 1,
  'This is the last sentence.' => 6
}

Then:

choose_best(hash, 2)

should return:

['This is the first sentence.', 'This is the last sentence.']

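Every approach I can think of involves reordering the hash, which loses the order of the sentences. What is the best way to solve this?

5 answers:

Answer 0 (score: 2)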

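def extract(hash, n)
  # Score of the n-th best entry; anything at or above it qualifies.
  min = hash.values.sort[-n]
  a = []
  # Walk the hash in insertion order and stop collecting after n keys.
  # Note: on ties at the cutoff, the first n qualifying keys win.
  hash.each { |k, v| a.push(k) if a.size < n && v >= min }
  a
end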

Answer 1 (score: 1)

Try the following monster:

hash.map(&:reverse).each_with_index
                   .sort_by(&:first).reverse
                   .take(2)
                   .sort_by(&:last)
                   .map { |(_,s),_| s }

Another, functional one:

hash.to_a.values_at(*hash.values.each_with_index
                         .sort.reverse
                         .map(&:last)
                         .take(2).sort)
         .map(&:first)

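Note, however, that as an unordered data structure, a hash table is not really suited to this use case (even though Ruby 1.9 remembers insertion order). You should use an array instead (the sorting code stays the same):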

sentences = [
  ['This is the first sentence.',  5],
  ['This is the second sentence.', 1],
  ['This is the last sentence.',   6],
]
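
As a rough sketch of how the selection might look on such an array of pairs: the name choose_best is borrowed from the question, while the body is only an illustration and not part of the original answer. It tags each pair with its position, takes the n best by score, and then sorts back by position.

```ruby
# Hypothetical helper (name taken from the question, body is an assumption):
# pick the n highest-scoring sentences from [sentence, score] pairs and
# return them in their original order.
def choose_best(pairs, n)
  pairs.each_with_index
       .sort_by { |(_sentence, score), _index| -score } # best score first
       .first(n)                                        # keep the n best
       .sort_by { |_pair, index| index }                # restore original order
       .map { |(sentence, _score), _index| sentence }
end

sentences = [
  ['This is the first sentence.',  5],
  ['This is the second sentence.', 1],
  ['This is the last sentence.',   6],
]

p choose_best(sentences, 2)
```

Carrying the index along with each pair means the original position never has to be looked up again after sorting.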

Answer 2 (score: 1)

hash = {
  'This is the first sentence.' => 5,
  'This is the second sentence.' => 1,
  'This is the last sentence.' => 6
}

cutoff_val = hash.values.sort[-2] #cf. sawa
p hash.select{|k,v| v >= cutoff_val } 
# =>{"This is the first sentence."=>5, "This is the last sentence."=>6}

Answer 3 (score: 0)

Since Ruby 2.2.0, Enumerable#max_by takes an optional integer argument, which makes it return an array instead of just one element. So we can do:

hash = {
  'This is the first sentence.' => 6,
  'This is the second sentence.' => 1,
  'This is the last sentence.' => 5
}

p hash.max_by(2, &:last).map(&:first).sort_by { |k| hash.keys.index k }
# => ["This is the first sentence.", "This is the last sentence."]

The final call to sort_by guarantees that the sentences are in the right order, as you asked.

Answer 4 (score: -1)
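
One thing to be aware of: hash.keys.index k builds the key array and scans it for every selected key. For larger hashes, a possible variant (a sketch only, with the helper name choose_best borrowed from the question) carries each entry's position along via each_with_index, so no lookup is needed afterwards. Like the code above, it assumes Ruby 2.2 or later for max_by with a count.

```ruby
# Hypothetical helper (name from the question, body is an assumption):
# return the n highest-scoring keys in the hash's insertion order.
def choose_best(hash, n)
  hash.each_with_index
      .max_by(n) { |(_sentence, score), _index| score } # n best entries
      .sort_by { |_pair, index| index }                 # back to insertion order
      .map { |(sentence, _score), _index| sentence }
end

hash = {
  'This is the first sentence.' => 6,
  'This is the second sentence.' => 1,
  'This is the last sentence.' => 5
}

p choose_best(hash, 2)
```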

a = hash.sort_by { |sentence, score| score }.reverse

The array a now contains [sentence, score] pairs, ordered from the highest score to the lowest. You can take the first N.

hash = {"foo" => 7, "bar" => 2, "blah" => 3 }
a = hash.sort_by { |sentence, score| score }.reverse
# => [["foo", 7], ["blah", 3], ["bar", 2]]
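
Taking the first N pairs this way gives them in score order, not in the original order the question asks for. One way to get the original order back, sketched here with the hash from the question and an assumed N of 2, is to intersect the hash's keys with the selected ones, since Array#& keeps the order of its receiver:

```ruby
hash = {
  'This is the first sentence.' => 5,
  'This is the second sentence.' => 1,
  'This is the last sentence.' => 6
}
n = 2 # assumed value of N for this example

a = hash.sort_by { |_sentence, score| score }.reverse
top = a.first(n).map(&:first) # the n best sentences, best score first
in_order = hash.keys & top    # same sentences, in the hash's insertion order

p in_order
```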