I have an array of arrays of hashes.
items =
[{ "item_9": 152 }, { "item_2": 139 }, { "item_13": 138 }, { "item_72": 137 }, { "item_125": 140 }, { "item_10": 144 }]
[{ "item_9": 152 }, { "item_2": 139 }, { "item_13": 138 }, { "item_72": 137 }, { "item_125": 140 }, { "item_10": 146 }]
[{ "item_9": 152 }, { "item_2": 139 }, { "item_13": 138 }, { "item_72": 137 }, { "item_125": 140 }, { "item_10": 147 }]
[{ "item_9": 152 }, { "item_2": 139 }, { "item_13": 138 }, { "item_72": 137 }, { "item_125": 140 }, { "item_10": 148 }]
[{ "item_9": 152 }, { "item_2": 139 }, { "item_13": 138 }, { "item_72": 137 }, { "item_125": 140 }, { "item_10": 153 }]
.
.
.
[{ "item_9": 152 }, { "item_2": 145 }, { "item_13": 150 }, { "item_72": 154 }, { "item_125": 141 }, { "item_10": 144 }]
[{ "item_9": 152 }, { "item_2": 145 }, { "item_13": 150 }, { "item_72": 154 }, { "item_125": 141 }, { "item_10": 146 }]
[{ "item_9": 152 }, { "item_2": 145 }, { "item_13": 150 }, { "item_72": 154 }, { "item_125": 141 }, { "item_10": 147 }]
[{ "item_9": 152 }, { "item_2": 145 }, { "item_13": 150 }, { "item_72": 154 }, { "item_125": 141 }, { "item_10": 148 }]
[{ "item_9": 152 }, { "item_2": 145 }, { "item_13": 150 }, { "item_72": 154 }, { "item_125": 141 }, { "item_10": 153 }]
What I want to do is turn it into an array of hashes:
items =
{"item_9"=>152, "item_2"=>145, "item_13"=>150, "item_72"=>154, "item_125"=>141, "item_10"=>146}
{"item_9"=>152, "item_2"=>145, "item_13"=>150, "item_72"=>154, "item_125"=>141, "item_10"=>147}
{"item_9"=>152, "item_2"=>145, "item_13"=>150, "item_72"=>154, "item_125"=>141, "item_10"=>148}
{"item_9"=>152, "item_2"=>145, "item_13"=>150, "item_72"=>154, "item_125"=>141, "item_10"=>153}
I believe I can use:
items.map! { |item| item.reduce({}, :merge) }
However, it is not very efficient. At least it does not perform well when you have 140 million records. Is there a better way?
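For reference, here is a minimal sketch of what that one-liner does, using two shortened rows in the shape of the question's data (the variable name `merged` is only illustrative). Hash#merge is non-destructive, so each step of the reduce returns a new hash, which is a plausible reason for the cost on very large inputs.

```ruby
# Two shortened rows in the same shape as the question's data.
items = [
  [{ "item_9" => 152 }, { "item_2" => 139 }, { "item_10" => 144 }],
  [{ "item_9" => 152 }, { "item_2" => 145 }, { "item_10" => 146 }]
]

# Hash#merge returns a new hash at every step, so each row builds
# one intermediate hash per element before the final result.
merged = items.map { |row| row.reduce({}, :merge) }

p merged
```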
Answer 0 (score: 3)
Perhaps a bit long, but it runs faster:
require 'benchmark'
items = [
[{ item_9: 152 }, { item_2: 139 }, { item_13: 138 }, { item_72: 137 }, { item_125: 140 }, { item_10: 146 }],
[{ item_9: 152 }, { item_2: 139 }, { item_13: 138 }, { item_72: 137 }, { item_125: 140 }, { item_10: 147 }],
[{ item_9: 152 }, { item_2: 139 }, { item_13: 138 }, { item_72: 137 }, { item_125: 140 }, { item_10: 148 }],
[{ item_9: 152 }, { item_2: 139 }, { item_13: 138 }, { item_72: 137 }, { item_125: 140 }, { item_10: 153 }],
[{ item_9: 152 }, { item_2: 145 }, { item_13: 150 }, { item_72: 154 }, { item_125: 141 }, { item_10: 144 }],
[{ item_9: 152 }, { item_2: 145 }, { item_13: 150 }, { item_72: 154 }, { item_125: 141 }, { item_10: 146 }],
[{ item_9: 152 }, { item_2: 145 }, { item_13: 150 }, { item_72: 154 }, { item_125: 141 }, { item_10: 147 }],
]
n = 100_000
Benchmark.bm do |b|
  # 1. reduce with :merge (builds a new hash at every step)
  b.report do
    n.times do
      items.map { |item| item.reduce({}, :merge) }
    end
  end
  # 2. reduce with :update (the winner)
  b.report do
    n.times do
      items.map { |item| item.reduce({}, :update) }
    end
  end
  # 3. inject with an explicit block calling update
  b.report do
    n.times do
      items.map { |item| item.inject({}) { |acc, h| acc.update(h) } }
    end
  end
end
As @tokland pointed out, item.reduce({}, :update) is faster:
user system total real
6.300000 0.080000 6.380000 ( 6.386180)
1.840000 0.020000 1.860000 ( 1.860073)
2.220000 0.020000 2.240000 ( 2.237294)
Thanks, @tokland.
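A likely explanation for the gap is that Hash#merge returns a new hash each time, while Hash#update (an alias of merge!) mutates the receiver and returns it. The sketch below only illustrates that difference on one shortened row. The variable names are made up for the example, and the each_with_object variant was not part of the benchmark above.

```ruby
row = [{ item_9: 152 }, { item_2: 139 }, { item_10: 146 }]

# merge: non-destructive. The seed hash stays empty and the result
# is a different object.
seed_for_merge = {}
via_merge = row.reduce(seed_for_merge, :merge)

# update: destructive. The seed hash itself is filled and returned.
seed_for_update = {}
via_update = row.reduce(seed_for_update, :update)

# An equivalent spelling (not benchmarked here) that also mutates
# a single accumulator hash.
via_each_with_object = row.each_with_object({}) { |h, acc| acc.update(h) }

p via_merge.equal?(seed_for_merge)   # merge produced a separate object
p via_update.equal?(seed_for_update) # update returned the seed itself
```

Because the seed `{}` is created fresh for every row, mutating it with update does not touch the original single-key hashes.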
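Answer 1 (score: 0)
Since performance is an issue, it may be time to fall back to a plain for loop with yield, and also to look for interesting properties of your data, if there are any. For example, your data seems to contain many duplicates. Is that a rule or a coincidence?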
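One possible reading of that suggestion is sketched below: a hypothetical helper, `each_merged` (not from the original answer), that walks the rows with plain for loops and yields one merged hash at a time, so the caller can process each result without first materialising the whole output array. It does not exploit duplicate rows, and it has not been benchmarked against the reduce-based versions.

```ruby
# Hypothetical helper: yields one merged hash per row.
def each_merged(items)
  for row in items
    merged = {}
    for h in row
      merged.update(h) # mutate the per-row accumulator in place
    end
    yield merged
  end
end

items = [
  [{ item_9: 152 }, { item_2: 139 }, { item_10: 146 }],
  [{ item_9: 152 }, { item_2: 145 }, { item_10: 147 }]
]

# The caller decides what to do with each hash; here they are collected.
collected = []
each_merged(items) { |h| collected << h }

p collected
```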
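Answer 2 (score: 0)
If you are sure you have a two-level array (no further arrays inside the pairs) and exactly two items in each pair, then this is faster and shorter: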
array = [['A', 'a'], ['B', 'b'], ['C', 'c']]
hash = Hash[*array.flatten]
For arrays nested more than two levels deep, this produces wrong results, or even an error for some inputs.
array = [['A', 'a'], ['B', 'b'], ['C', ['a', 'b', 'c']]]
hash = Hash[*array.flatten]
# => {"A"=>"a", "B"=>"b", "C"=>"a", "b"=>"c"}
But if you are running Ruby 1.8.7 or later, you can pass an argument to Array#flatten so that it flattens only one level:
# on Ruby 1.8.7+
hash = Hash[*array.flatten(1)]
# => {"A"=>"a", "B"=>"b", "C"=>["a", "b", "c"]}
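The question's rows are arrays of single-key hashes rather than arrays of pairs, so the Hash[] approach needs one extra step. A minimal sketch, assuming every element of a row is a hash: convert each hash to its key/value pairs with to_a, flatten one level with flat_map, and hand the pairs to Hash[]. Whether this beats reduce({}, :update) on real data is untested here.

```ruby
row = [{ item_9: 152 }, { item_2: 139 }, { item_13: 138 }]

# Each single-key hash becomes [[key, value]]; flat_map joins those
# into one flat list of [key, value] pairs.
pairs = row.flat_map(&:to_a)

# Hash[] accepts an array of two-element arrays directly, so no splat
# is needed and values that are themselves arrays stay intact.
merged = Hash[pairs]

p pairs
p merged
```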