Question

Is there a simple way to find the intersection of a two-dimensional array? For example:

arr1 = [1,2,3,4,5]
arr2 = [5,6,7,8]
arr3 = [5]
big_arr = [arr1,arr2,arr3]

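I know I can do this:

intersection = arr1 & arr2 & arr3 # => [5]
intersection = big_arr[0] & big_arr[1] & big_arr[2] # => [5]

But the number of elements in big_arr will vary. I'd like to know whether there is a simple way to intersect all the elements of big_arr, regardless of how many there are.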

Answer 1

Use #reduce, like this:

arr1 = [1,2,3,4,5]
arr2 = [5,6,7,8]
arr3 = [5]
bigarr = [arr1,arr2,arr3]
bigarr.reduce(:&) # => [5]
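
Two edge cases are worth knowing when the number of arrays varies: reduce(:&) returns nil for an empty outer array, and with a single inner array it returns that array untouched, duplicates included, because & is never called. A minimal sketch of a wrapper that covers both follows; the name intersect_all is hypothetical and not part of the answer above.

```ruby
# Hypothetical wrapper around reduce(:&); intersect_all is an illustrative name.
def intersect_all(arrays)
  return [] if arrays.empty?   # reduce(:&) alone would return nil here
  arrays.reduce(:&).uniq       # uniq covers the single-array case, where & is never applied
end

intersect_all([[1, 2, 3, 4, 5], [5, 6, 7, 8], [5]])  # => [5]
intersect_all([])                                     # => []
intersect_all([[1, 1, 2]])                            # => [1, 2]
```

Whether an empty input should yield [] or nil is a design choice; [] is assumed here so that callers can always treat the result as an array.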

Answer 2

What do you want: a method with a pretty face, or one that is first to the finish line? My friend @Arup has supplied one; I'll offer the other.

Code

def heavy_lifter(a)
  wee_one = a.min_by(&:size)
  return [] if wee_one.empty?
  wee_loc = a.index(wee_one)
  counts = wee_one.each_with_object({}) { |e,h| h.update(e=>1) }
  nbr_reqd = 1
  a.each_with_index do |b,i|
    next if i == wee_loc
    b.each do |e|
      cnt = counts[e]
      case
      when cnt.nil?
        next
      when cnt == nbr_reqd
        counts[e] = cnt + 1
      when cnt < nbr_reqd
        counts.delete(e)
        return [] if counts.empty?
      end
    end
    nbr_reqd += 1
  end
  counts.keys.each { |k| counts.delete(k) if counts[k] < nbr_reqd }
  counts.keys
end

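Example

a1 = [1,2,3,4,5]
a2 = [5,6,7,8]
a3 = [5]
a = [a1,a2,a3]
heavy_lifter(a) #=> [5]

Explanation

Here's how the method works:

Select the smallest array (wee_one). To simplify the explanation, assume it is the first element of a.
Convert wee_one to a counting hash, counts, where counts[e] = 1 for each element e of wee_one.
Iterate through the remaining arrays.
Keys of counts are removed as the arrays are processed.
After all calculations are complete, counts.keys equals the intersection of all arrays.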

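After nbr_reqd arrays have been processed (including wee_one), counts[k] equals the number of arrays that have been found to contain k. Obviously, if counts[k] < nbr_reqd, key k can be removed from counts (but we do not remove such keys until our attention is drawn to them, or at the very end).

Suppose we are now about to process the array b at offset nbr_reqd, meaning that nbr_reqd arrays have already been processed (including wee_one at offset zero). For each element e of b, we obtain cnt = counts[e]. There are four possibilities:

cnt == nil, in which case there is nothing to do;

cnt < nbr_reqd, in which case key e is removed from counts;

cnt == nbr_reqd, meaning e was present in every array processed so far, in which case we execute counts[e] = cnt + 1; and

cnt == nbr_reqd+1, meaning e was present in every array processed so far and duplicates another e in b that has already been processed, in which case nothing is to be done.

nbr_reqd is then incremented by one and the process is repeated for the next array.

After all arrays have been processed, all that remains is to remove each key k of counts for which counts[k] < nbr_reqd.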
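
To make the four cases concrete, here is a small tracing sketch of the same counting idea. It is not the answer's method: trace_counts is a hypothetical helper that assumes the first array is the smallest (as the explanation does) and records a copy of counts after each array is processed.

```ruby
# Hypothetical helper for illustration; assumes arrays.first is the smallest array.
def trace_counts(arrays)
  smallest, *rest = arrays
  counts = smallest.each_with_object({}) { |e, h| h[e] = 1 }
  snapshots = [counts.dup]
  nbr_reqd = 1
  rest.each do |b|
    b.each do |e|
      cnt = counts[e]
      next if cnt.nil?          # e is not in the smallest array: nothing to do
      if cnt == nbr_reqd
        counts[e] = cnt + 1     # first sighting of e in b
      elsif cnt < nbr_reqd
        counts.delete(e)        # e was missing from an earlier array
      end                       # cnt == nbr_reqd + 1: duplicate within b, ignored
    end
    nbr_reqd += 1
    snapshots << counts.dup
  end
  snapshots
end

snaps = trace_counts([[5, 7], [5, 5, 6], [7, 8, 5, 9]])
# snaps[0] => {5=>1, 7=>1}   after the smallest array
# snaps[1] => {5=>2, 7=>1}   7 is stale but not yet removed
# snaps[2] => {5=>3}         7 is deleted when it is seen in the third array
```

Note that 7 survives the second array with a stale count and is dropped only when the third array draws attention to it. A key that never reappears would be caught by the final sweep over counts.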

Cutie method

def cutie(a)
  a.reduce(:&)
end

Test data

def test(mx, *sizes)
  sizes.map { |sz| Array.new(sz) { rand(mx) } }
end

For example:

test(10,5,6,7) #=> [[9, 1, 5, 1, 1], [0, 8, 7, 8, 5, 0], [5, 1, 7, 6, 7, 9, 5]]
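
Because test draws from the global random number generator, each call produces different arrays. If repeatable inputs are wanted when comparing runs, one option is a seeded variant such as the sketch below. The name seeded_test and its seed parameter are assumptions for illustration and were not used for the timings reported in this answer.

```ruby
# Hypothetical seeded variant of test; the same seed yields the same arrays.
def seeded_test(seed, mx, *sizes)
  rng = Random.new(seed)                              # private generator, independent of Kernel#rand
  sizes.map { |sz| Array.new(sz) { rng.rand(mx) } }   # integers in 0...mx
end

a = seeded_test(42, 10, 5, 6, 7)
b = seeded_test(42, 10, 5, 6, 7)
a == b  # => true
```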

Benchmark code

require 'benchmark'

def bench(tst)
  Benchmark.bm(12) do |bm|
    bm.report 'cutie' do
      cutie(tst)
    end
    bm.report 'heavy_lifter' do
      heavy_lifter(tst)
    end
  end
end

Benchmark results

tst = test(1_000_000, 400_000, 600_000, 800_000)
cutie(tst).size #=> 81929
cutie(tst).sort == heavy_lifter(tst).sort #=> true
bench(tst)
                   user     system      total        real
cutie          1.610000   0.030000   1.640000 (  1.639736)
heavy_lifter   1.800000   0.020000   1.820000 (  1.824281)

sizes = (700_000..890_000).step(10_000).to_a
  #=> [700000, 710000, 720000, 730000, 740000,
  #    750000, 760000, 770000, 780000, 790000,
  #    800000, 810000, 820000, 830000, 840000,
  #    850000, 860000, 870000, 880000, 890000]
tst = test(1_000_000, *sizes)
bench(tst)
                   user     system      total        real
cutie         14.090000   0.440000  14.530000 ( 14.679101)
heavy_lifter   5.830000   0.030000   5.860000 (  5.935438)
