Question

I have a data set similar to the following:

[
  {:option_id => 10, :option_style_ids => [9, 10, 11]},
  {:option_id => 7, :option_style_ids => [19]},
  {:option_id => 8, :option_style_ids => [1]},
  {:option_id => 5, :option_style_ids => [4, 5]},
  {:option_id => 10, :option_style_ids => [9, 10, 11]},
  {:option_id => 7, :option_style_ids => [19]},
  {:option_id => 5, :option_style_ids => [4, 5]},
  {:option_id => 8, :option_style_ids => [1]},
  {:option_id => 12, :option_style_ids => [20]},
  {:option_id => 5, :option_style_ids => [2, 5]}
]

I want to consolidate the data set into this output:

[
  {:option_id => 10, :option_style_ids => [9, 10, 11]},
  {:option_id => 7, :option_style_ids => [19]},
  {:option_id => 8, :option_style_ids => [1]},
  {:option_id => 5, :option_style_ids => [2, 4, 5]},
  {:option_id => 12, :option_style_ids => [20]}
]

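The output above has the duplicates removed. For the hashes with option_id: 5, however, I need the option_style_ids arrays to be combined, because some of their values differ.

I tried: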

r.group_by{|h| h[:option_id]}.map{|k,v| v.reduce(:merge)}

Unfortunately, this does not merge the option_style_ids array values.
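A minimal sketch of why this happens, using only the three option_id 5 hashes from the question (the variable name fives is made up for illustration): without a block, Hash#merge keeps the value from the last hash for any duplicate key, so the arrays are overwritten rather than combined.

```ruby
# The three hashes with :option_id => 5 from the question.
fives = [
  {:option_id => 5, :option_style_ids => [4, 5]},
  {:option_id => 5, :option_style_ids => [4, 5]},
  {:option_id => 5, :option_style_ids => [2, 5]}
]

# Without a block, merge keeps the right-hand value for duplicate keys,
# so only the last array survives.
merged = fives.reduce(:merge)

merged[:option_style_ids] #=> [2, 5]
```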

Answer 1

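This can be done with Hash#update (aka Hash#merge!), using the form that takes a block to determine the values of keys that are present in both of the hashes being merged.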
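As a small illustration of the block form, here is a sketch with toy data (the hashes stock and delivery are hypothetical and not part of the question):

```ruby
# Hypothetical toy hashes, not from the question.
stock    = {:apples => 3, :pears => 1}
delivery = {:apples => 2, :plums => 4}

# The block is called only for keys present in both hashes (:apples here).
# It receives the key, the old value and the new value, and its return
# value becomes the merged value.
stock.update(delivery) { |_key, old, new| old + new }

stock #=> {:apples=>5, :pears=>1, :plums=>4}
```

Keys that appear in only one of the hashes (:pears and :plums) are copied over without the block being consulted.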

Code

def merge_em(arr)
  arr.each_with_object({}) do |g,h|
    h.update(g[:option_id]=>g) do |_,o,n|
      { :option_id=>o[:option_id],
        :option_style_ids=>o[:option_style_ids] | n[:option_style_ids] }
    end
  end.values
end

Example

For the array given in the question, which I will call arr:

merge_em(arr)
  #=> [{:option_id=>10, :option_style_ids=>[9, 10, 11]},
  #    {:option_id=> 7, :option_style_ids=>[19]},
  #    {:option_id=> 8, :option_style_ids=>[1]},
  #    {:option_id=> 5, :option_style_ids=>[4, 5, 2]},
  #    {:option_id=>12, :option_style_ids=>[20]}]
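Note that the style ids for option_id 5 come back as [4, 5, 2], whereas the question lists [2, 4, 5]. If that order matters, one possible variation sorts each combined array afterwards. This is only a sketch: the name merge_em_sorted is made up here, and the input is a shortened version of the question's data.

```ruby
# Variation on merge_em that also sorts each combined id array.
# merge_em_sorted is an illustrative name, not part of the original answer.
def merge_em_sorted(arr)
  arr.each_with_object({}) do |g, h|
    h.update(g[:option_id] => g) do |_, o, n|
      { :option_id => o[:option_id],
        :option_style_ids => o[:option_style_ids] | n[:option_style_ids] }
    end
  end.values.map do |g|
    # Build a new hash so the input hashes are left untouched.
    g.merge(:option_style_ids => g[:option_style_ids].sort)
  end
end

# Shortened sample input.
arr = [
  {:option_id => 5,  :option_style_ids => [4, 5]},
  {:option_id => 12, :option_style_ids => [20]},
  {:option_id => 5,  :option_style_ids => [2, 5]}
]

result = merge_em_sorted(arr)
  #=> [{:option_id=>5, :option_style_ids=>[2, 4, 5]},
  #    {:option_id=>12, :option_style_ids=>[20]}]
```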

Explanation

To explain what is happening, let me simplify arr:

arr = [
  { :option_id => 10, :option_style_ids => [9, 10, 11] },
  { :option_id =>  7, :option_style_ids => [19] },
  { :option_id => 10, :option_style_ids => [9, 12] }
]

The steps:

enum = arr.each_with_object({})
  #=> #<Enumerator: [
  #     {:option_id=>10, :option_style_ids=>[9, 10, 11]},
  #     {:option_id=> 7, :option_style_ids=>[19]},
  #     {:option_id=>10, :option_style_ids=>[9, 12]}
  #   ]:each_with_object({})>

We can see the elements of enum by converting it to an array:

enum.to_a
  #=> [[{:option_id=>10, :option_style_ids=>[9, 10, 11]}, {}],
  #    [{:option_id=> 7, :option_style_ids=>[19]}, {}],
  #    [{:option_id=>10, :option_style_ids=>[9, 12]}, {}]]

As you can see, enum contains three elements.

The first element of enum is passed to the block and assigned to the block variables:

g,h = enum.next
  #=> [{:option_id=>10, :option_style_ids=>[9, 10, 11]}, {}]
g #=> {:option_id=>10, :option_style_ids=>[9, 10, 11]}
h #=> {}

We now perform the block calculation:

h.update(g[:option_id]=>g)
  #=> {}.update(10=>{:option_id=>10, :option_style_ids=>[9, 10, 11]})
  #=> {10=>{:option_id=>10, :option_style_ids=>[9, 10, 11]}}

update returns the new value of h.

When { 10=>g } is merged into h (Ruby permits the shorthand (10=>g)), h has no key 10, so update's block is not consulted to determine the merged value of h[10].

The next element of enum is passed to the block:

g,h = enum.next
  #=> [{:option_id=>7, :option_style_ids=>[19]},
  #    {10=>{:option_id=>10, :option_style_ids=>[9, 10, 11]}}]
g #=> {:option_id=>7, :option_style_ids=>[19]}
h #=> {10=>{:option_id=>10, :option_style_ids=>[9, 10, 11]}}

Note that h has been updated.

We now perform the block calculation:

h.update(g[:option_id]=>g)
  #=> {10=>{:option_id=>10, :option_style_ids=>[9, 10, 11]}}
  #     .update(7=>{:option_id=>7, :option_style_ids=>[19]})
  #=> {10=>{:option_id=>10, :option_style_ids=>[9, 10, 11]},
  #     7=>{:option_id=> 7, :option_style_ids=>[19]}}

Again, h does not have the key 7, so update's block is not used.

The last element of enum is now passed to the block and the block calculation is performed:

g,h = enum.next
g #=> {:option_id=>10, :option_style_ids=>[9, 12]}
h #=> {10=>{:option_id=>10, :option_style_ids=>[9, 10, 11]},
  #     7=>{:option_id=> 7, :option_style_ids=>[19]}}
h.update(10=>g)

This time h already contains the key (10) of the hash being merged into it ({ 10=>g }). update's block is therefore called to determine the value of that key in the merged hash. The block is passed three values: the key, the old value and the new value.

k,o,n = [10, h[10], g]
  #=> [10, {:option_id=>10, :option_style_ids=>[9, 10, 11]},
  #        {:option_id=>10, :option_style_ids=>[9, 12]}]
k #=> 10
o #=> {:option_id=>10, :option_style_ids=>[9, 10, 11]}
n #=> {:option_id=>10, :option_style_ids=>[9, 12]}

We want the block to return:

{:option_id=>10, :option_style_ids=>[9, 10, 11, 12]}

This is most easily done with:

{ :option_id=>o[:option_id],
  :option_style_ids=>o[:option_style_ids] | n[:option_style_ids] }
  #=> { :option_id=>10, :option_style_ids=>[9, 10, 11] | [9, 12] }
  #=> { :option_id=>10, :option_style_ids=>[9, 10, 11, 12] }

h[10] is set to this value, so h is now:

h #=> {10=>{:option_id=>10, :option_style_ids=>[9, 10, 11, 12]},
  #     7=>{:option_id=>7, :option_style_ids=>[19]}}

Since we have finished enumerating enum, this is the value returned by each_with_object. The last step is to extract the values of this hash:

h.values
  #=> [{:option_id=>10, :option_style_ids=>[9, 10, 11, 12]},
  #    {:option_id=> 7, :option_style_ids=>[19]}]
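The | operator used inside the block is Array#|, which returns the union of two arrays. A short sketch with the values from the walkthrough (the variable names old_ids and new_ids are chosen here for illustration):

```ruby
old_ids = [9, 10, 11]
new_ids = [9, 12]

# Array#| returns the union: elements of the left array first, then any
# elements of the right array not already present, without duplicates.
union = old_ids | new_ids             #=> [9, 10, 11, 12]

# Concatenating and then removing duplicates gives the same result.
concat_uniq = (old_ids + new_ids).uniq  #=> [9, 10, 11, 12]
```

Neither expression sorts its result, which is why combining [4, 5] with [2, 5] yields [4, 5, 2] rather than [2, 4, 5].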

Answer 2

This is not exactly the output you want, but you can transform it from here.

require 'set'
src = [...your original array of hashes...]
style_ids_by_option_id = {}
src.each do |e|
  style_ids_by_option_id[e[:option_id]] ||= Set.new
  style_ids_by_option_id[e[:option_id]].merge(e[:option_style_ids])
end

This produces a data structure like the following:

{10=>#<Set: {9, 10, 11}>,
 7=>#<Set: {19}>,
 8=>#<Set: {1}>,
 5=>#<Set: {4, 5, 2}>,
 12=>#<Set: {20}>}
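One possible way to convert that structure into the array of hashes the question asks for is sketched below. It assumes the hash has the shape shown above; the sample here is a shortened stand-in, and the sort is optional and only included to match the order in the question.

```ruby
require 'set'

# A shortened stand-in for the hash built above.
style_ids_by_option_id = {
  10 => Set.new([9, 10, 11]),
  5  => Set.new([4, 5, 2])
}

# Turn each key/Set pair back into a hash, sorting the ids on the way out.
result = style_ids_by_option_id.map do |option_id, style_ids|
  {:option_id => option_id, :option_style_ids => style_ids.to_a.sort}
end

result
  #=> [{:option_id=>10, :option_style_ids=>[9, 10, 11]},
  #    {:option_id=>5, :option_style_ids=>[2, 4, 5]}]
```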

Answer 3

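array.group_by { |h| h[:option_id] }.values.map do |group|
  group.inject do |merged, other|
    # Keep :option_id as is; combine the style id arrays without duplicates.
    merged.merge(other) { |key, old, new| key == :option_id ? old : (old + new).uniq }
  end
end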
# => [
#   {:option_id=>10, :option_style_ids=>[9, 10, 11]},
#   {:option_id=>7, :option_style_ids=>[19]},
#   {:option_id=>8, :option_style_ids=>[1]},
#   {:option_id=>5, :option_style_ids=>[4, 5, 2]},
#   {:option_id=>12, :option_style_ids=>[20]}
# ]

How to merge an array of hashes with nested arrays
