Merge an array of hashes: each value should be the average of the merged values

Time: 2018-05-18 12:12:11

Tags: ruby algorithm

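Problem: merge the hashes in an array that share the same value for a specified key, and compute the average of the values for every other key.

My solution seems ugly.

Data: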

require 'pp'

arr = [{:red=>346.0,
  :unu=>10.0,
  :used=>20147.0,
  :acc_id=>550,
  :percent=>0.01},
 {:red=>0.0,
  :unu=>1.0,
  :used=>66.0,
  :acc_id=>550,
  :percent=>0.06},
 {:red=>120.0,
  :unu=>11.0,
  :used=>166.0,
  :acc_id=>550,
  :percent=>10.06},
 {:red=>1306.0,
  :unu=>1.0,
  :used=>13259.0,
  :acc_id=>9999,
  :percent=>0.0}]

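In the current example, the 3 hashes with :acc_id = 550 should be merged, so the resulting array should contain two hashes (the merged hash for :acc_id = 550 and the untouched hash for :acc_id = 9999).

Algorithm: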
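To make the grouping concrete, here is a minimal sketch of how the hashes split by :acc_id. The data is an abbreviated copy of the question's array (only :red and :acc_id are kept, which is my simplification), and `groups` is just an illustrative name.

```ruby
# Abbreviated copy of the question's data: only :red and :acc_id are kept.
arr = [{ red: 346.0,  acc_id: 550 },
       { red: 0.0,    acc_id: 550 },
       { red: 120.0,  acc_id: 550 },
       { red: 1306.0, acc_id: 9999 }]

# Enumerable#group_by returns a hash from each key value to its hashes.
groups = arr.group_by { |h| h[:acc_id] }

groups.each { |acc_id, hashes| puts "#{acc_id}: #{hashes.size}" }
# prints:
# 550: 3
# 9999: 1
```

Each group then has to collapse into a single hash, which is where the averaging comes in.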

data = []
arr.group_by{|h| h[:acc_id] }.map {|_, arr_of_hashes|
  sz = arr_of_hashes.size
  if sz > 1
    arr_of_hashes = arr_of_hashes.inject{|memo, el|
      memo.merge(el) {|k, old_v, new_v| old_v + new_v}
    }

    arr_of_hashes.map {|k, v| arr_of_hashes[k] = v / sz}
  end
  data << arr_of_hashes if arr_of_hashes.is_a? Hash
  data << arr_of_hashes[0] if arr_of_hashes.is_a? Array
}

pp data

Expected result: the merged array of hashes

[{:red=>155.33333333333334,
  :unu=>7.333333333333333,
  :used=>6793.0,
  :acc_id=>550,
  :percent=>3.376666666666667},
 {:red=>1306.0,
  :unu=>1.0,
  :used=>13259.0,
  :acc_id=>9999,
  :percent=>0.0}]

... ... ...

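2 answers:

Answer 0: (score: 0)

I found one mistake: you are applying + and / to acc_id as well. Here is a refactored version to try, though I think it can still be improved.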

data = []
arr.group_by{|h| h[:acc_id] }.map {|_, arr_of_hashes|
  sz = arr_of_hashes.size
  result = Hash.new(0)
  arr_of_hashes.map{ |hash| hash.map{ |k,v| result[k] += v/sz unless k == :acc_id } }
  result[:acc_id] = arr_of_hashes.first[:acc_id]
  data << result
}
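One possible refinement, as a sketch only: sum each field first and divide once per field, and return the result from a method instead of appending to an outer array. The method name `merge_by_average` and its parameters are hypothetical, not something from the question. Note that the grouping key ends up last in each resulting hash.

```ruby
# Hypothetical helper: averages every field except the grouping key.
def merge_by_average(rows, key)
  rows.group_by { |h| h[key] }.map do |key_value, group|
    sums = Hash.new(0)
    group.each do |h|
      h.each { |k, v| sums[k] += v unless k == key }
    end
    # Divide each total once; fdiv always performs float division.
    averaged = sums.map { |k, total| [k, total.fdiv(group.size)] }.to_h
    # Put the untouched key value back (it becomes the last entry).
    averaged.merge(key => key_value)
  end
end

arr = [{ red: 346.0,  unu: 10.0, used: 20147.0, acc_id: 550,  percent: 0.01 },
       { red: 0.0,    unu: 1.0,  used: 66.0,    acc_id: 550,  percent: 0.06 },
       { red: 120.0,  unu: 11.0, used: 166.0,   acc_id: 550,  percent: 10.06 },
       { red: 1306.0, unu: 1.0,  used: 13259.0, acc_id: 9999, percent: 0.0 }]

result = merge_by_average(arr, :acc_id)
puts result.size         # prints 2
puts result[0][:used]    # prints 6793.0
puts result[0][:acc_id]  # prints 550
```

Because the key is skipped rather than summed, this should also work when the key's value is not numeric.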

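Answer 1: (score: 0)

I suggest you compute the averages as follows. I assume that, as in the example, all hashes have the same keys (though the keys need not appear in the same order).

Code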

def doit(arr, key)
  keys = arr.first.keys
  arr.group_by { |g| g[key] }.
      map do |_,a| 
        averages = a.map { |h| h.values_at(*keys) }.
                     transpose.
                     map { |v| v.sum.fdiv(v.size) }
        keys.zip(averages).to_h
      end
end 
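The assumption about keys matters because doit takes `keys` from the first hash only and then reads every hash with `values_at(*keys)`. A small sketch with made-up hashes (the names `same_order`, `other_order` and `incomplete` are illustrative only) shows why key order is harmless while a missing key is not:

```ruby
keys = [:red, :unu, :acc_id]

same_order  = { red: 1.0, unu: 2.0, acc_id: 7 }
other_order = { acc_id: 7, unu: 4.0, red: 3.0 }

# values_at returns values in the order of the requested keys,
# regardless of the order in which each hash stores them.
rows = [same_order, other_order].map { |h| h.values_at(*keys) }
  #=> [[1.0, 2.0, 7], [3.0, 4.0, 7]]

# A hash that lacks some of the keys yields nil in those positions.
incomplete = { red: 5.0 }
missing = incomplete.values_at(*keys)
  #=> [5.0, nil, nil]
```

A column containing nil cannot be summed, so doit would fail on such input rather than silently averaging over fewer values.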

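Example

Note that my array of hashes differs somewhat from the one given in the question's example. Specifically, there are three (rather than two) groups of hashes that share a common value for the key :acc_id.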

arr = [{ red: 346.0,  unu: 10.0, used: 20147.0, acc_id: 550,  percent: 0.01 },
       { red: 0.0,    unu: 1.0,  used: 66.0,    acc_id: 550,  percent: 0.06 },
       { red: 120.0,  unu: 11.0, used: 166.0,   acc_id: 10,   percent: 10.06 },
       { red: 100.0,  unu: 19.0, used: 170.0,   acc_id: 10,   percent: 11.56 },
       { red: 1306.0, unu: 1.0,  used: 13259.0, acc_id: 9999, percent: 0.0 }]

doit(arr, :acc_id)
  #=> [{:red=>173.0, :unu=>5.5, :used=>10106.5, :acc_id=>550.0, :percent=>0.035},
  #    {:red=>110.0, :unu=>15.0, :used=>168.0, :acc_id=>10.0, :percent=>10.81},
  #    {:red=>1306.0, :unu=>1.0, :used=>13259.0, :acc_id=>9999.0, :percent=>0.0}]
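In the output above, :acc_id comes back as a Float (550.0) because doit averages every key, including the grouping key. If the original key value should be preserved, one possible variant is sketched below. The name `doit_keeping_key` is hypothetical, and in this version the grouping key becomes the first entry of each hash.

```ruby
# Hypothetical variant of doit: the grouping key is excluded from averaging.
def doit_keeping_key(arr, key)
  keys = arr.first.keys - [key]
  arr.group_by { |g| g[key] }.map do |key_value, a|
    averages = a.map { |h| h.values_at(*keys) }
                .transpose
                .map { |v| v.sum.fdiv(v.size) }
    # Start from the untouched key, then add the averaged fields.
    { key => key_value }.merge(keys.zip(averages).to_h)
  end
end

arr = [{ red: 346.0, unu: 10.0, acc_id: 550 },
       { red: 0.0,   unu: 1.0,  acc_id: 550 },
       { red: 120.0, unu: 11.0, acc_id: 10 }]

result = doit_keeping_key(arr, :acc_id)
puts result[0][:acc_id]  # prints 550
puts result[0][:red]     # prints 173.0
puts result[0][:unu]     # prints 5.5
```

Like doit, this relies on Array#sum, so it assumes Ruby 2.4 or later.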

Explanation

See Enumerable#group_by and Array#sum (the latter made its debut in Ruby v2.4).

The steps are as follows.
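Since Array#sum only exists from v2.4 onward, older Rubies need a substitute. A minimal sketch, assuming Enumerable#inject with :+ is an acceptable replacement, using the three :red values for acc_id 550 from the question:

```ruby
values = [346.0, 0.0, 120.0]  # the :red values for acc_id 550 in the question

# Ruby 2.4 and later: Array#sum is available.
average_with_sum = values.sum.fdiv(values.size)

# Fallback for older Rubies: Enumerable#inject with :+.
average_with_inject = values.inject(:+).fdiv(values.size)

puts average_with_sum     # prints 155.33333333333334
puts average_with_inject  # prints 155.33333333333334
```

For other float inputs the two may differ in the last digits, because Array#sum may use a more precise summation method than plain repeated addition.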

key = :acc_id
keys = arr.first.keys
  #=> [:red, :unu, :used, :acc_id, :percent]
b = arr.group_by { |g| g[key] }
  #=> { 550=>[{:red=>346.0, :unu=>10.0, :used=>20147.0, :acc_id=>550, :percent=>0.01},
  #           {:red=>0.0, :unu=>1.0, :used=>66.0, :acc_id=>550, :percent=>0.06}],
  #      10=>[{:red=>120.0, :unu=>11.0, :used=>166.0, :acc_id=>10, :percent=>10.06},
  #           {:red=>100.0, :unu=>19.0, :used=>170.0, :acc_id=>10, :percent=>11.56}],
  #    9999=>[{:red=>1306.0, :unu=>1.0, :used=>13259.0, :acc_id=>9999, :percent=>0.0}]}
b.map do |_,a| 
  averages = a.map { |h| h.values_at(*keys) }.
               transpose.
               map { |v| v.sum.fdiv(v.size) }
  keys.zip(averages).to_h
end 
  #=> <the array of hashes shown above>
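The body of the block can be broken down further. Here is a sketch of the intermediate values for a single group, the one with acc_id 10. The variable names `rows`, `columns` and `merged` are mine, not part of the answer's code.

```ruby
keys = [:red, :unu, :used, :acc_id, :percent]
a = [{ red: 120.0, unu: 11.0, used: 166.0, acc_id: 10, percent: 10.06 },
     { red: 100.0, unu: 19.0, used: 170.0, acc_id: 10, percent: 11.56 }]

# One row of values per hash, always in the order given by keys.
rows = a.map { |h| h.values_at(*keys) }
  #=> [[120.0, 11.0, 166.0, 10, 10.06], [100.0, 19.0, 170.0, 10, 11.56]]

# Transposing turns the rows into one array per key.
columns = rows.transpose
  #=> [[120.0, 100.0], [11.0, 19.0], [166.0, 170.0], [10, 10], [10.06, 11.56]]

# Average each column; fdiv forces float division even for the integer ids.
averages = columns.map { |v| v.sum.fdiv(v.size) }

# Pair each key with its average again.
merged = keys.zip(averages).to_h
```

`merged` corresponds to the second hash in the doit(arr, :acc_id) result shown earlier.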