I have an array of hashes:
data = [{"user_id"=>1, "answer"=>"cupcakes"},
{"user_id"=>1, "answer"=>"Colorado"},
{"user_id"=>1, "answer"=>"newspaper"},
{"user_id"=>2, "answer"=>"fruitcake"},
{"user_id"=>2, "answer"=>"Louisiana"},
{"user_id"=>2, "answer"=>"tv"}]
How can I reorganize it so that it is grouped by "user_id" and lists all of the "answer" values in a single hash? Something like:
output_data = [{"user_id" => 1, "answer1"=>"cupcakes", "answer2"=>"Colorado", "answer3"=>"newspaper"},
{"user_id" => 2, "answer1"=>"fruitcake", "answer2"=>"Louisiana", "answer3"=>"tv"}]
Or perhaps with all of the answers in an array:
output_data = [{"user_id" => 1, "answers"=>["cupcakes", "Colorado", "newspaper"]},
{"user_id" => 2, "answers"=>["fruitcake", "Louisiana", "tv"]}]
I'm not tied to this particular output. I need "user_id" as the key, with all of that user's answers grouped together. Any suggestions?
Answer 0 (score: 5)
You could do this:
Code
def convert(arr)
  arr.each_with_object({}) do |g,h|
    h.update(g["user_id"]=>[g["answer"]]) { |_,o,n| o+n }
  end.map { |k,v| { "user_id"=>k, "answer"=>v } }
end
Example
convert(data)
#=> [{"user_id"=>1, "answer"=>["cupcakes", "Colorado", "newspaper"]},
# {"user_id"=>2, "answer"=>["fruitcake", "Louisiana", "tv"]}]
Explanation
We have:
enum = data.each_with_object({})
#=> #<Enumerator: [{"user_id"=>1, "answer"=>"cupcakes"},
# {"user_id"=>1, "answer"=>"Colorado"},
# {"user_id"=>1, "answer"=>"newspaper"},
# {"user_id"=>2, "answer"=>"fruitcake"},
# {"user_id"=>2, "answer"=>"Louisiana"},
# {"user_id"=>2, "answer"=>"tv"}]:
# each_with_object({})>
We can convert the enumerator to an array to see the values that will be passed to the block:
a = enum.to_a
#=> [[{"user_id"=>1, "answer"=>"cupcakes"}, {}],
# [{"user_id"=>1, "answer"=>"Colorado"}, {}],
# [{"user_id"=>1, "answer"=>"newspaper"}, {}],
# [{"user_id"=>2, "answer"=>"fruitcake"}, {}],
# [{"user_id"=>2, "answer"=>"Louisiana"}, {}],
# [{"user_id"=>2, "answer"=>"tv"}, {}]]
As you can see, the enumerator has six elements, each a two-element array consisting of an element of data and a hash that is initially empty.
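The key point is that I'm using the form of Hash#update (aka merge!) that takes a block to determine the values of keys that are present in both of the hashes being merged.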
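As a minimal sketch of that form of Hash#update, separate from the question's data: the hash totals and its keys are made up for illustration. The block receives the key, the old value and the new value, and is called only for keys present in both hashes.

```ruby
# Illustrative data only; `totals` is not part of the original question.
totals = { "a" => 1 }

# "a" exists in both hashes, so the block decides its value (1 + 2).
# "b" exists only in the argument, so it is copied without calling the block.
totals.update("a" => 2, "b" => 5) { |_key, old, new| old + new }

totals #=> {"a"=>3, "b"=>5}
```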
The first element of enum is passed to the block and assigned to the block variables as follows:
g, h = enum.next
#=> [{"user_id"=>1, "answer"=>"cupcakes"}, {}]
g #=> {"user_id"=>1, "answer"=>"cupcakes"}
h #=> {}
The block calculation is therefore:
h.update(g["user_id"]=>[g["answer"]])
# {}.update(1=>["cupcakes"])
#=> {1=>["cupcakes"]}
h #=> {1=>["cupcakes"]}
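update's block is not used for this first merge operation because (before the merge) h has no key 1. In a later operation, g["user_id"] #=> 1 again. At that point the block is used to determine the value of key 1.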
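To make that concrete, here is a small trace over just the first two elements of data. The array block_calls is a hypothetical helper added only to record the arguments each time the block runs.

```ruby
# Only the first two elements of data, both for user 1.
data = [{ "user_id" => 1, "answer" => "cupcakes" },
        { "user_id" => 1, "answer" => "Colorado" }]

h = {}
block_calls = [] # hypothetical helper: records [key, old, new] per block call

data.each do |g|
  h.update(g["user_id"] => [g["answer"]]) do |key, old, new|
    block_calls << [key, old, new]
    old + new
  end
end

# The block ran once, for the second element, because key 1 already existed.
block_calls #=> [[1, ["cupcakes"], ["Colorado"]]]
h           #=> {1=>["cupcakes", "Colorado"]}
```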
This results in:
h = data.each_with_object({}) do |g,h|
  h.update(g["user_id"]=>[g["answer"]]) { |_,o,n| o+n }
end
#=> { 1=>["cupcakes", "Colorado", "newspaper"],
# 2=>["fruitcake", "Louisiana", "tv"] }
It is then a simple matter to map the key-value pairs of h to the desired array of hashes.
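The mapping step can be sketched on its own, starting from the grouped hash. The second half is an assumption about how one might produce the numbered "answer1", "answer2", ... layout from the question; it is not part of the answer's code, and the names result and numbered are illustrative.

```ruby
h = { 1 => ["cupcakes", "Colorado", "newspaper"],
      2 => ["fruitcake", "Louisiana", "tv"] }

# One hash per user, with the answers kept in an array.
result = h.map { |user_id, answers| { "user_id" => user_id, "answer" => answers } }

# Possible variant: one numbered key per answer, as in the question's first layout.
numbered = h.map do |user_id, answers|
  pairs = answers.each_with_index.map { |answer, i| ["answer#{i + 1}", answer] }
  { "user_id" => user_id }.merge(pairs.to_h)
end

numbered.first
#=> {"user_id"=>1, "answer1"=>"cupcakes", "answer2"=>"Colorado", "answer3"=>"newspaper"}
```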
Alternative
An alternative to doing this by merging hashes is the following:
data.each_with_object(Hash.new { |h,k| h[k]=[] }) do |g,h|
  h[g["user_id"]] << g["answer"]
end.map { |k,v| { "user_id"=>k, "answer"=>v } }
#=> [{"user_id"=>1, "answer"=>["cupcakes", "Colorado", "newspaper"]},
# {"user_id"=>2, "answer"=>["fruitcake", "Louisiana", "tv"]}]
This gives the hash a default value of an empty array whenever h[k] is to be modified and h does not have a key k. For example:
h = Hash.new { |h,k| h[k]=[] }
#=> {}
h[:cat] << 'boots'
#=> ["boots"]
h #=> {:cat=>["boots"]}
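The block form matters here. As a short sketch of why (the names shared and per_key are illustrative), compare it with Hash.new([]), which hands back the same array object for every missing key and never stores a key:

```ruby
# Hash.new([]) uses ONE array object as the default for every missing key.
shared = Hash.new([])
shared[:cat] << "boots"
shared[:dog] << "rex"
shared.empty? #=> true (nothing was ever assigned to a key)
shared[:cat]  #=> ["boots", "rex"] (both pushes hit the same default array)

# The block form assigns a fresh array to the key the first time it is read.
per_key = Hash.new { |hash, key| hash[key] = [] }
per_key[:cat] << "boots"
per_key[:dog] << "rex"
per_key #=> {:cat=>["boots"], :dog=>["rex"]}
```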
Answer 1 (score: 5)
Your expected result doesn't make sense. To keep the "answer" information, you need to hold the values in an array.
data.group_by{|h| h["user_id"]}.each{|_, v| v.map!{|h| h["answer"]}}
# =>
# {
# 1=>["cupcakes", "Colorado", "newspaper"],
# 2=>["fruitcake", "Louisiana", "tv"]
# }
Strings like "user_id" and "answer" are redundant, and you should avoid having them in the data unless they help make it clearer in some way.
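If the array-of-hashes layout from the question is still wanted, the group_by result above can be reshaped with one more step. This is a sketch rather than part of the answer; the names grouped, by_user and output are illustrative, and transform_values assumes Ruby 2.4 or later.

```ruby
data = [{ "user_id" => 1, "answer" => "cupcakes" },
        { "user_id" => 1, "answer" => "Colorado" },
        { "user_id" => 1, "answer" => "newspaper" },
        { "user_id" => 2, "answer" => "fruitcake" },
        { "user_id" => 2, "answer" => "Louisiana" },
        { "user_id" => 2, "answer" => "tv" }]

# Group the rows by user; each value is an array of the original hashes.
grouped = data.group_by { |row| row["user_id"] }

# Compact form: user_id => array of answers (Ruby 2.4+ for transform_values).
by_user = grouped.transform_values { |rows| rows.map { |row| row["answer"] } }

# The question's second layout: one hash per user with an "answers" array.
output = by_user.map { |user_id, answers| { "user_id" => user_id, "answers" => answers } }

output.first #=> {"user_id"=>1, "answers"=>["cupcakes", "Colorado", "newspaper"]}
```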