Question

I have this array:

array = ["1", "2", "3", "4"]

I have this array of hashes:

ah = [
 {:id=>"1", :value=>"A"},
 {:id=>"2", :value=>"B"},
 {:id=>"3", :value=>"C"},
 {:id=>"4", :value=>"D"},
 {:id=>"5", :value=>"E"},
 {:id=>"6", :value=>"F"},
 {:id=>"7", :value=>"G"},
 {:id=>"8", :value=>"H"},
]

I need to reject every hash in ah whose :id is not in array.

What is the best way to achieve this?

Answer 1

You can select the inverse, i.e. the hashes whose :id is in array, with the following code:

ah.select{|el| array.include?(el[:id])}

If you prefer reject, you can use:

ah.reject{|el| !array.include?(el[:id])}

Read more: Array#reject, Array#select. These methods create a new array; if you want to modify the array in place, use Array#reject! or Array#select!.
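To illustrate the difference between the copying and in-place variants, here is a minimal sketch on a shortened version of the question's data. The names kept and ah_copy are made up for this example.

```ruby
array = ["1", "2"]
ah = [
  { id: "1", value: "A" },
  { id: "2", value: "B" },
  { id: "5", value: "E" }
]

# select returns a new array and leaves ah untouched.
kept = ah.select { |el| array.include?(el[:id]) }
puts kept.size  # 2
puts ah.size    # 3

# select! (like reject!) modifies the receiver itself.
ah_copy = ah.dup
ah_copy.select! { |el| array.include?(el[:id]) }
puts ah_copy.size  # 2
```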

Answer 2

For large amounts of data I would do some preprocessing to avoid O(n*m) lookups.

array = ["1", "2", "3", "4"]
array_hash = array.each_with_object({}){ |i, h| h[i] = true }
ah.select{ |obj| array_hash[obj[:id]] }
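As a rough sketch of why this helps: the lookup hash is built once, after which each membership test is a hash lookup instead of a scan of array. The example below uses a shortened ah (an assumption, not the full data from the question) and checks that the result matches the include? version.

```ruby
array = ["1", "2", "3", "4"]
ah = [
  { id: "1", value: "A" },
  { id: "4", value: "D" },
  { id: "7", value: "G" }
]

# Build the lookup table once; ids missing from it return nil (falsy).
array_hash = array.each_with_object({}) { |i, h| h[i] = true }

by_hash    = ah.select { |obj| array_hash[obj[:id]] }
by_include = ah.select { |obj| array.include?(obj[:id]) }

puts by_hash == by_include  # true
```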

Answer 3

I realize there is already an accepted answer, but since all the answers here are O(n*m), I thought I'd propose an alternative in O(n)*.

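Here is a rough benchmark for the case where the ah array contains 100_000 items and the sub-array contains 10_000 items. I'm including fl00r's answer and Cary's answer here, as we are all trying to avoid the O(n*m) scenario.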

                                user     system      total        real
select with include        34.610000   0.110000  34.720000 ( 34.924679)
reject with include        34.320000   0.100000  34.420000 ( 34.611992)
group and select            0.170000   0.010000   0.180000 (  0.182358)
select by value             0.040000   0.000000   0.040000 (  0.041073)
select with set             0.040000   0.000000   0.040000 (  0.048331)
hashify then values         0.130000   0.010000   0.140000 (  0.139686)

Code to reproduce this:

require 'benchmark'
require 'set'

list_size = 100_000
sub_list_size = 10_000

ah = Array.new(list_size) { |i| { id: i, value: "A" } }

# sub_list_size random ids, drawn with replacement from 0..list_size
array = Array.new(sub_list_size) { rand(0..list_size) }

def group_than_select(ah, array)
  grouped = ah.group_by { |x| x[:id] }

  # Keep only the ids that also appear in array (intersection, not difference)
  good_keys = grouped.keys & array
  good_keys.map { |i| grouped[i] }.flatten
end

def select_by_fl00r(ah, array)
  array_hash = array.each_with_object({}){ |i, h| h[i] = true }
  ah.select{ |obj| array_hash[obj[:id]] }
end

def select_with_set(ah, array)
  array_to_set = array.to_set
  ah.select { |h| array_to_set.include?(h[:id]) }
end

def hashify_then_values_at(ah, array)
  h = ah.each_with_object({}) { |g,h| h.update(g[:id]=>g) }
  h.values_at(*(h.keys & array))
end

Benchmark.bm(25) do |x|
  x.report("select with include") do
    ah.select{|el| array.include?(el[:id])}
  end
  x.report("reject with include") do
    ah.reject{|e| !array.include?(e[:id])}
  end
  x.report("group and select") do
    group_than_select(ah, array)
  end
  x.report("select by value") do
    select_by_fl00r(ah, array)
  end
  x.report("select with set") do
    select_with_set(ah, array)
  end
  x.report("hashify then values") do
    hashify_then_values_at(ah, array)
  end
end

* Hash map lookups are typically O(1), but an O(n) worst case is possible.
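Before trusting timings like these, it can be worth confirming on a small input that the strategies return the same elements. The following is only a sketch with made-up data in the same shape as the benchmark (integer ids, duplicates allowed in array); the names with_include, with_hash, and with_set are illustrative.

```ruby
require 'set'

# Small stand-in for the benchmark data: integer ids, with a duplicate
# and an id (42) that does not occur in ah.
ah = Array.new(10) { |i| { id: i, value: "A" } }
array = [3, 7, 3, 42]

with_include = ah.select { |el| array.include?(el[:id]) }

lookup = array.each_with_object({}) { |i, h| h[i] = true }
with_hash = ah.select { |obj| lookup[obj[:id]] }

id_set = array.to_set
with_set = ah.select { |h| id_set.include?(h[:id]) }

puts with_include == with_hash            # true
puts with_hash == with_set                # true
puts with_set.map { |h| h[:id] }.inspect  # [3, 7]
```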

Answer 4

A better solution than rejecting the ids that are not in the array is to accept only those that are:

ah.select { |hash| array.include?(hash[:id]) }

Answer 5

Here are two possibilities.

array = ["1", "2", "3", "4", "99999999"]

#1

I expect the include? solutions would be considerably faster if you first converted array to a set:

require 'set'

def select_with_set(ah, array) 
  array_to_set = array.to_set
  ah.select { |h| array_to_set.include?(h[:id]) }
end

select_with_set(ah, array) 
  #=> [{:id=>"1", :value=>"A"}, {:id=>"2", :value=>"B"},
  #    {:id=>"3", :value=>"C"}, {:id=>"4", :value=>"D"}]

#2

If, as in the example, the hash elements of ah have distinct values for :id, you could do this:

def hashify_then_values_at(ah, array)    
  h = ah.each_with_object({}) { |g,h| h.update(g[:id]=>g) }
  h.values_at(*(h.keys & array))
end

hashify_then_values_at(ah, array)
  #=> [{:id=>"1", :value=>"A"}, {:id=>"2", :value=>"B"},
  #    {:id=>"3", :value=>"C"}, {:id=>"4", :value=>"D"}]
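The distinct-:id condition matters because the intermediate hash keeps only one entry per id. A minimal sketch, assuming a made-up ah_dup in which two elements share the id "1" (the block parameter is renamed to acc here only to avoid shadowing):

```ruby
def hashify_then_values_at(ah, array)
  # Later elements with the same :id overwrite earlier ones.
  h = ah.each_with_object({}) { |g, acc| acc.update(g[:id] => g) }
  h.values_at(*(h.keys & array))
end

ah_dup = [
  { id: "1", value: "A" },
  { id: "1", value: "Z" },
  { id: "2", value: "B" }
]
array = ["1", "2"]

by_values_at = hashify_then_values_at(ah_dup, array)
by_select    = ah_dup.select { |h| array.include?(h[:id]) }

puts by_values_at.size  # 2
puts by_select.size     # 3
```

With duplicate ids, a select-based approach keeps every matching element, while this one silently keeps only the last element for each id.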
