Question

[1, 1, 1, 2, 3].mode
=> 1

['cat', 'dog', 'snake', 'dog'].mode
=> dog

Answer 1

First build a hash mapping each value in the array to its frequency...

arr = [1, 1, 1, 2, 3]

freq = arr.inject(Hash.new(0)) { |h,v| h[v] += 1; h }
#=> {1=>3, 2=>1, 3=>1}

...then use the frequency table to find the element with the highest frequency:

arr.max_by { |v| freq[v] }
#=> 1

Answer 2

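While I love the grep solution for its elegance, and for reminding (or teaching) me about a method in Enumerable that I'd forgotten (or overlooked completely), it's slow, slow, slow. I agree 100% that creating an Array#mode method is a good idea. However, this is Ruby: we don't need a library of functions that act on arrays, we can create a mixin that adds the necessary methods to the Array class itself.

But the inject(Hash) alternative uses a sort, which we don't really need either: we just want the value with the highest number of occurrences.

Neither solution addresses the possibility that more than one value may be the mode. Maybe that's not an issue in the problem as stated (I can't tell). I think I'd want to know if there was a tie, though, and in any case I think we can improve a little on the performance.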

require 'benchmark'

class Array
  def mode1
    sort_by {|i| grep(i).length }.last
  end
  def mode2
    freq = inject(Hash.new(0)) { |h,v| h[v] += 1; h }
    sort_by { |v| freq[v] }.last    
  end
  def mode3
    freq = inject(Hash.new(0)) { |h,v| h[v] += 1; h }
    max = freq.values.max                   # we're only interested in the key(s) with the highest frequency
    freq.select { |k, f| f == max }         # extract the keys that have the max frequency
  end
end

arr = Array.new(1_000) { |i| rand(100) }    # something to test with

Benchmark.bm(30) do |r|
  res = {}
  (1..3).each do |i|
    m = "mode#{i}"
    r.report(m) do
      100.times do
        res[m] = arr.send(m).inspect
      end
    end
  end
  res.each { |k, v| puts "%10s = %s" % [k, v] }
end

Here's the output from a sample run.

                                user     system      total        real
mode1                          34.375000   0.000000  34.375000 ( 34.393000)
mode2                           0.359000   0.000000   0.359000 (  0.359000)
mode3                           0.219000   0.000000   0.219000 (  0.219000)
     mode1 = 41
     mode2 = 41
     mode3 = [[41, 17], [80, 17], [72, 17]]

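The "optimised" mode3 took about 60% of the time of the previous record-holder (mode2). Note also the multiple highest-frequency entries.

Edit

A few months down the line, I noticed Nilesh's answer, which offered this: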
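One caveat if you rerun this on a newer Ruby: the array-of-pairs output above reflects Ruby 1.8, where Hash#select returned an array. From 1.9 on it returns a Hash, so mode3 yields something like {41=>17, 80=>17, 72=>17}. A minimal sketch, assuming Ruby 1.9 or later, that returns just the tied values (the name modes is made up for this example and is not part of the benchmark above):

```ruby
# Hypothetical helper: returns every value that shares the highest
# frequency, as a plain array.
def modes(arr)
  freq = arr.each_with_object(Hash.new(0)) { |v, h| h[v] += 1 }
  max = freq.values.max
  freq.select { |_, f| f == max }.keys   # Hash#select returns a Hash on 1.9+
end

modes([1, 1, 2, 2, 3])  #=> [1, 2]
modes([1, 1, 1, 2, 3])  #=> [1]
```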

def mode4
  group_by{|i| i}.max{|x,y| x[1].length <=> y[1].length}[0]
end

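It doesn't work with 1.8.6 out of the box, because that version doesn't have Array#group_by. For Rails developers, ActiveSupport has it, although it seems about 2-3% slower than mode3 above. Using the (excellent) backports gem, though, produces a 10-12% gain, and also delivers a whole pile of 1.8.7 and 1.9 extras.

The above applies to 1.8.6 only, and mainly only if it's installed on Windows. Since I have it installed, here's what you get from IronRuby 1.0 (on .NET 4.0):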

==========================   IronRuby   =====================================
(iterations bumped to **1000**)    user     system      total        real
mode1 (I didn't bother :-))
mode2                           4.265625   0.046875   4.312500 (  4.203151)
mode3                           0.828125   0.000000   0.828125 (  0.781255)
mode4                           1.203125   0.000000   1.203125 (  1.062507)

So if performance is super-critical, benchmark the options on your own Ruby version and OS. YMMV.
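If you do want to re-measure on your own setup, here is one possible harness. It is only a sketch: it assumes Ruby 1.9 or later, the labels and the CANDIDATES constant are invented for this example, and it uses Benchmark.bmbm from the standard library, which runs a rehearsal pass before the measured pass.

```ruby
require 'benchmark'

# Two approaches from this thread, written as lambdas so nothing is
# monkey-patched. The labels are illustrative only.
CANDIDATES = {
  'inject + max_by' => lambda { |a|
    freq = a.inject(Hash.new(0)) { |h, v| h[v] += 1; h }
    a.max_by { |v| freq[v] }
  },
  'group_by' => lambda { |a|
    a.group_by { |i| i }.max_by { |_, vs| vs.length }[0]
  }
}

arr = Array.new(1_000) { rand(100) }   # something to test with

# bmbm runs everything twice; only the second pass is the real measurement.
Benchmark.bmbm(20) do |r|
  CANDIDATES.each do |label, impl|
    r.report(label) { 100.times { impl.call(arr) } }
  end
end
```

The absolute numbers will not match the tables in this thread; only the ranking on your own machine is meaningful.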

Answer 3

array.max_by { |i| array.count(i) }

Answer 4

Mike: I found a faster method. Try this:

  class Array
    def mode4
      group_by{|i| i}.max{|x,y| x[1].length <=> y[1].length}[0]
    end
  end

Benchmark output:

                                    user     system      total        real
mode1                          24.340000   0.070000  24.410000 ( 24.526991)
mode2                           0.200000   0.000000   0.200000 (  0.195348)
mode3                           0.120000   0.000000   0.120000 (  0.118200)
mode4                           0.050000   0.010000   0.060000 (  0.056315)
     mode1 = 76
     mode2 = 76
     mode3 = [[76, 18]]
     mode4 = 76

Answer 5

arr = [ 1, 3, 44, 3 ]
most_frequent_item = arr.uniq.max_by{ |i| arr.count( i ) }
puts most_frequent_item
#=> 3

No need to even think about frequency mappings.

Answer 6

This is a duplicate of this question: Ruby - Unique elements in Array

Here is the solution from that question:

group_by { |n| n }.values.max_by(&:size).first

This version seems to be even faster than Nilesh C's answer. Here is the code I used to benchmark it (OS X 10.6 Core 2 2.4GHz MB).

Kudos to Mike Woodhouse for the (original) benchmark code:

require 'benchmark'

class Array
   def mode1
     group_by { |n| n }.values.max_by(&:size).first
   end
   def mode2
     freq = inject(Hash.new(0)) { |h,v| h[v] += 1; h }
     max = freq.values.max                   # we're only interested in the key(s) with the highest frequency
     freq.select { |k, f| f == max }         # extract the keys that have the max frequency
   end
end

arr = Array.new(10_000) { |i| rand(100_000) }    # something to test with

Benchmark.bm(30) do |r|
    (1..2).each do |i| r.report("mode#{i}") { 100.times do arr.send("mode#{i}").inspect; end }; end
end

And here are the results of the benchmark:

                                user     system      total        real
mode1                           1.830000   0.010000   1.840000 (  1.876642)
mode2                           2.280000   0.010000   2.290000 (  2.382117)
 mode1 = 70099
 mode2 = [[70099, 3], [70102, 3], [51694, 3], [49685, 3], [38410, 3], [90815, 3], [30551, 3], [34720, 3], [58373, 3]]

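As you can see, this version is about 20% faster, with the caveat that it ignores ties. I also like how succinct it is; I personally use it as is, without monkey patching all over the place. :)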

Answer 7

If you're trying to avoid learning #inject (which you shouldn't do...)

words = ['cat', 'dog', 'snake', 'dog']
count = Hash.new(0)

words.each {|word| count[word] += 1}
count.sort_by { |k,v| v }.last

But if I had read this answer before, I would know nothing about #inject now, and man, you need to know about #inject.
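For comparison, the same count written with #inject might look like the sketch below. It also swaps sort_by ... last for max_by, which is my substitution rather than part of the answer above.

```ruby
words = ['cat', 'dog', 'snake', 'dog']

# Same tally as the each loop, folded into a single expression. The block
# must return the accumulator (h), which is why it ends with "; h".
count = words.inject(Hash.new(0)) { |h, word| h[word] += 1; h }
#=> {"cat"=>1, "dog"=>2, "snake"=>1}

# max_by finds the highest count without sorting the whole hash.
count.max_by { |_, n| n }
#=> ["dog", 2]
```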

Answer 8

idx = {}
[2,2,1,3,1].each { |i| idx.include?(i) ? idx[i] += 1 : idx[i] = 1}

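This is just a simple indexer. You could replace the [2,2,1..] array with any symbol/string based identifier. It wouldn't work with objects; for that you'd need to introduce a bit more complexity, but this is simple enough.

Rereading your question, this solution is a bit over-engineered, since it returns an index of the counts for all values, not just the most frequent one.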
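To get from that index to the most frequent value(s), one more step is needed. A possible sketch, assuming Ruby 1.9 or later; it uses Hash.new(0) in place of the include? check, and the variable top is introduced only for this example:

```ruby
idx = Hash.new(0)                       # default of 0 replaces the include? check
[2, 2, 1, 3, 1].each { |i| idx[i] += 1 }
idx                                     #=> {2=>2, 1=>2, 3=>1}

top = idx.values.max                    #=> 2
idx.select { |_, n| n == top }.keys     #=> [2, 1]  (2 and 1 tie)
```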

Answer 9

Here's another version, one that returns every tied value, as a mode should:

class Array
  def mode
    group_by {|x| x}.group_by {|k,v| v.size}.sort.last.last.map(&:first)
  end
end

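In other words: group the values, then group those key/value pairs by the number of values, then sort those groups, take the last (largest-size) group, and unwind its values. I like group_by.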
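Here is how that chain unfolds on a small input. The intermediate variable names are made up for illustration, and the results assume Ruby 1.9 or later, where hashes keep insertion order:

```ruby
arr = [1, 1, 2, 2, 3]

by_value = arr.group_by { |x| x }
#=> {1=>[1, 1], 2=>[2, 2], 3=>[3]}

by_size = by_value.group_by { |_k, v| v.size }
#=> {2=>[[1, [1, 1]], [2, [2, 2]]], 1=>[[3, [3]]]}

largest = by_size.sort.last             # sorted by size, so last is the biggest
#=> [2, [[1, [1, 1]], [2, [2, 2]]]]

result = largest.last.map(&:first)      # keep only the original values
#=> [1, 2]
```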

Answer 10

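Ruby >= 2.7 has Enumerable#tally:

Tallies the collection. Returns a hash where the keys are the elements and the values are numbers of elements in the collection that correspond to the key.

So, you can do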

[1, 1, 1, 2, 3].tally
# => {1=>3, 2=>1, 3=>1}
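The tally alone is only the frequency table; getting the mode out of it takes one more call. A short sketch, assuming Ruby 2.7 or later (filter_map also arrived in 2.7); the variable names are illustrative:

```ruby
arr = [1, 1, 1, 2, 3]

counts = arr.tally                      #=> {1=>3, 2=>1, 3=>1}

# Single most frequent element
single = counts.max_by { |_, n| n }.first
#=> 1

# Every element tied for the highest count
top = counts.values.max
all_modes = counts.filter_map { |v, n| v if n == top }
#=> [1]
```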

Answer 11

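def mode(array)
    values = array.compact      # ignore nils without mutating the caller's array
    unique = values.uniq

    # Number of times each unique element is repeated in the array
    counts = unique.map { |i| values.count(i) }
    max = counts.max

    # Keep every element whose count equals the maximum (so ties are returned)
    output = unique.zip(counts).select { |_, c| c == max }.map(&:first)

    return output               # an Array, so `p` prints [3] rather than "[3]"
end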

p mode([3,3,4,5]) #=> [3]

p mode([1,2,3]) #=> [1,2,3]

p mode([0,0,0,0,0,1,2,3,3,3,3,3]) #=> [0,3]

p mode([-1,-1,nil,nil,nil,0]) #=> [-1]

p mode([-2,-2,3,4,5,6,7,8,9,10,1000]) #=> [-2]

Ruby: How to find the item in an array which has the most occurrences?

11 answers: