I have a large data set with a fixed number of categories. I originally stored everything in one big array of hashes. It worked fine, but given the size of the data and the redundancy of the categories, it wasn't efficient.
I now use a hash keyed by type/category, storing an array of hashes under each category.
My current method of adding data is to delete each hash's :type key before pushing the hash into the array for its type. Everything works, but I believe there is a more streamlined, "Ruby way" of doing this:
# Very large data set with redundant types.
gigantic_array = [
  { type: 'a', organization: 'acme inc', president: 'bugs bunny' },
  { type: 'a', organization: 'looney toons', president: 'donald' },
  { type: 'b', organization: 'facebook', president: 'mark' },
  { type: 'b', organization: 'myspace', president: 'whoknows' },
  { type: 'c', organization: 'walmart', president: 'wall' }
  # multiply length by ~1000
]
# Still gigantic, but more efficient.
# Stores each type as a symbol.
# Each type maps to an array of hashes.
more_efficient_hash = {
  type: {
    a: [
      { organization: 'acme inc', president: 'bugs bunny' },
      { organization: 'looney toons', president: 'donald' }
    ],
    b: [
      { organization: 'facebook', president: 'mark' },
      { organization: 'myspace', president: 'whoknows' }
    ],
    c: [
      { organization: 'walmart', president: 'wall' }
      # etc....
    ]
  }
}
hash_to_add = { type: 'c', organization: 'target', president: 'sharp' }
# Adds the hash to its type's array inside the gigantic more_efficient_hash.
# Is there a better way?
more_efficient_hash[:type][hash_to_add[:type].to_sym].push(hash_to_add.delete(:type))
Answer (score: 1)
I agree with undur_gongor that some small data classes would help, and that the :type key adds no value inside the result. For the initial conversion of gigantic_array, you can do the transformation easily with group_by. Note that Hash#delete returns the value of the deleted key, not the hash, so I don't believe your last line works the way you want.
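For example, a quick IRB check (h here is just a throwaway hash) shows what Hash#delete actually returns:

> h = { type: 'c', organization: 'target' }
> h.delete(:type)
=> "c"
> h
=> {:organization=>"target"}

With that in mind, the group_by conversion looks like this: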
> more_efficient_hash = gigantic_array.group_by { |item| item.delete(:type).to_sym }
{
  a: [
    {:organization=>"acme inc", :president=>"bugs bunny"},
    {:organization=>"looney toons", :president=>"donald"}
  ],
  b: [
    {:organization=>"facebook", :president=>"mark"},
    {:organization=>"myspace", :president=>"whoknows"}
  ],
  c: [
    {:organization=>"walmart", :president=>"wall"}
  ]
}
From there, your last line is pretty clean. Since delete is destructive, we can shorten it a bit:
> more_efficient_hash[hash_to_add.delete(:type).to_sym] << hash_to_add
# ...
c: [
  {:organization=>"walmart", :president=>"wall"},
  {:organization=>"target", :president=>"sharp"}
]
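One caveat, based on my assumption rather than anything in the question: if hash_to_add ever carries a type that isn't already a key in more_efficient_hash, the << above raises NoMethodError on nil. A minimal sketch of one way to guard against that, building the hash with a default proc instead of group_by:

# Sketch: the default proc creates the array the first time a type is seen,
# so new types can be pushed without pre-seeding any keys.
more_efficient_hash = Hash.new { |hash, key| hash[key] = [] }

gigantic_array.each do |item|
  more_efficient_hash[item.delete(:type).to_sym] << item
end

# Adding a brand-new type now works without extra setup:
more_efficient_hash[hash_to_add.delete(:type).to_sym] << hash_to_add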