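How do I compare two hashes that each contain about 25,000 hashes?

Asked: 2014-10-06 17:01:46

Tags: ruby hash compare

I have two hashes that each contain many nested hashes (product information).

I want to compare the two hashes to see which products have been added, removed, or updated (for example price, description, or images).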

old_hash.size
# => 24595

new_hash.size
# => 26153

Both hashes have the following structure:

{"wi230075"=>
  {"itemId"=>"wi230075",
   "description"=>"AH Verse frietaardappelen",
   "salesUnitSize"=>"2,5 kg",
   "images"=>[...],
   "fromPrice"=>2.19,
   "basePrice"=>{"price"=>2.19, "unitPriceDescription"=>"0.96/KG"},
   "score"=>0,
   "frozen"=>false,
   "isPBO"=>false,
   "outOfStock"=>false,
   "quantity"=>0,
   "extendedAttributes"=>[],
   "sourceId"=>{"source"=>"wi", "id"=>230075, "asString"=>"wi230075"},
   "hqIdSource"=>"AH_HQ",
   "hqId"=>822729,
   "productId"=>230075,
   "links"=>[],
   "category"=>"/Aardappel, groente, fruit/Aardappelen/Hele aardappel/",
   "brand"=>"AH"},
  {...}
}

I tried comparing the two hashes with the HashDiff gem. This is what I got:

diff = HashDiff.diff(old_hash, new_hash)
diff.size
# => 64378

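Something seems to be wrong: there cannot be 64,378 changes.

What is a better way to compare the two hashes?

Edit

I want to know whether a product has been added, removed, or edited. If it has, a simple true is enough.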

2 answers:

Answer 0 (score: 2)

This returns every key that has changed (that is, was created, deleted, or updated):

(old_hash.keys | new_hash.keys).select { |k| old_hash[k] != new_hash[k] }
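
To make the result concrete, here is a minimal sketch that runs the one-liner on two tiny catalogs. The item ids and prices are made up for illustration and are not taken from the question's data; the last line shows one possible way to get the "simple true" per product that the question asks for.

```ruby
# Hypothetical miniature catalogs; ids and prices are invented for this example.
old_hash = {
  "wi1" => { "itemId" => "wi1", "fromPrice" => 2.19 },
  "wi2" => { "itemId" => "wi2", "fromPrice" => 1.00 },
  "wi3" => { "itemId" => "wi3", "fromPrice" => 0.50 }
}
new_hash = {
  "wi1" => { "itemId" => "wi1", "fromPrice" => 2.19 }, # unchanged
  "wi2" => { "itemId" => "wi2", "fromPrice" => 1.25 }, # price updated
  "wi4" => { "itemId" => "wi4", "fromPrice" => 3.00 }  # added ("wi3" is gone)
}

# A key counts as changed when its value differs, including the case
# where one side has no entry at all (lookup returns nil).
changed_keys = (old_hash.keys | new_hash.keys).select { |k| old_hash[k] != new_hash[k] }
# changed_keys => ["wi2", "wi3", "wi4"]

# One flag per changed product.
changed = changed_keys.map { |k| [k, true] }.to_h
```

Because nested hashes and arrays are compared by value with `!=`, any difference in any field of a product marks that product as changed.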

To get more specific information, you can do the following:

keys = (old_hash.keys | new_hash.keys)
new_keys = keys.select { |k| old_hash[k].nil? }
deleted_keys = keys.select { |k| new_hash[k].nil? }
# Added and deleted keys also compare as unequal, so exclude them here.
modified_keys = keys.select { |k| old_hash[k] != new_hash[k] } - new_keys - deleted_keys
unchanged_keys = keys - (new_keys | deleted_keys | modified_keys)

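This assumes that you are not interested in keys whose value is nil. If you are, replace the .nil? calls with something else, such as a key-presence check with Hash#key?.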
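
If nil values do matter, one option is to classify keys by presence rather than by value. The following is only a sketch: `classify_keys` is a hypothetical helper name, not part of the answer, and the sample data is invented.

```ruby
# Hypothetical helper: classifies keys by presence in each hash,
# so a key stored with a nil value is still treated as present.
def classify_keys(old_hash, new_hash)
  added    = new_hash.keys - old_hash.keys   # only in new_hash
  removed  = old_hash.keys - new_hash.keys   # only in old_hash
  common   = old_hash.keys & new_hash.keys   # in both
  modified = common.select { |k| old_hash[k] != new_hash[k] }
  { added: added, removed: removed, modified: modified, unchanged: common - modified }
end

# Invented sample data that includes nil values.
old_sample = { "a" => nil, "b" => 1, "c" => 2 }
new_sample = { "a" => nil, "b" => 3, "d" => nil }

result = classify_keys(old_sample, new_sample)
# result => {:added=>["d"], :removed=>["c"], :modified=>["b"], :unchanged=>["a"]}
```

Here "a" is reported as unchanged and "d" as added even though both hold nil, which the .nil? version cannot distinguish.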

Answer 1 (score: 1)

I have not tested this code, but I think it would look something like this.

To get the added records:

added_keys = new_hash.keys - old_hash.keys
added_records = new_hash.select{|k,v| added_keys.include? k}

To get the removed records:

removed_keys = old_hash.keys - new_hash.keys
removed_records = old_hash.select{|k,v| removed_keys.include? k}
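
Note that Array#include? scans the key array linearly for every entry, which adds up with roughly 25,000 products per hash. A minimal sketch that selects the same records with Hash#key? lookups instead (the sample data is invented for illustration):

```ruby
# Invented sample data; only the keys matter for this example.
old_hash = { "a" => { "description" => "A" }, "b" => { "description" => "B" } }
new_hash = { "b" => { "description" => "B" }, "c" => { "description" => "C" } }

# Keep the entries of new_hash whose key does not exist in old_hash.
added_records = new_hash.reject { |k, _v| old_hash.key?(k) }
# added_records => {"c"=>{"description"=>"C"}}

# Keep the entries of old_hash whose key does not exist in new_hash.
removed_records = old_hash.reject { |k, _v| new_hash.key?(k) }
# removed_records => {"a"=>{"description"=>"A"}}
```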

To get the changed records:

changed_records = new_hash.select do |k, v|
  # Only products present in both hashes can count as changed.
  old_hash.has_key?(k) &&
    %w[description images basePrice].any? { |field| old_hash[k][field] != v[field] }
end
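
If it is also useful to know which of the watched fields changed for each product, the same comparison can return the field names instead of the whole record. This is a sketch under assumptions: `WATCHED_FIELDS` and `changed_fields` are hypothetical names, and the catalogs below are invented sample data, not the question's data.

```ruby
# Hypothetical list of fields to watch, taken from the comparison above.
WATCHED_FIELDS = %w[description images basePrice].freeze

# Hypothetical helper: maps each product id present in both hashes to the
# watched fields whose values differ. Products without differences are omitted.
def changed_fields(old_hash, new_hash, fields = WATCHED_FIELDS)
  new_hash.each_with_object({}) do |(id, product), result|
    next unless old_hash.key?(id) # added products are not "changed"
    diff = fields.select { |f| old_hash[id][f] != product[f] }
    result[id] = diff unless diff.empty?
  end
end

# Invented sample catalogs.
old_catalog = {
  "p1" => { "description" => "Potatoes", "images" => [], "basePrice" => { "price" => 2.19 } },
  "p2" => { "description" => "Carrots",  "images" => [], "basePrice" => { "price" => 0.99 } }
}
new_catalog = {
  "p1" => { "description" => "Potatoes", "images" => [], "basePrice" => { "price" => 2.49 } },
  "p2" => { "description" => "Carrots",  "images" => [], "basePrice" => { "price" => 0.99 } },
  "p3" => { "description" => "Onions",   "images" => [], "basePrice" => { "price" => 1.29 } }
}

report = changed_fields(old_catalog, new_catalog)
# report => {"p1"=>["basePrice"]}
```

Calling `report.key?(id)` then gives the plain true/false answer for a single product.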