Question

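I have a tricky array-editing problem. I have an array in which some elements are substrings of other elements. I want to remove all the substrings and keep only the superstrings. For example, given Array => ['1', '1 1', '1 1 1', '1 1 1 2', '1 2 3 1', '1 2', '2 3'], after the operation I should have a cleaned-up array => ['1 1 1 2', '1 2 3 1']

Is there an efficient algorithm to achieve this?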

Answer 1

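This method uses some array math to remove the element itself from the array, and then checks whether that element appears as a substring of anything that is left. I don't know how efficient it is.

I switched to delete_if because it may improve performance: the array can be shortened as substrings are found, which makes subsequent checks slightly faster.

Update: Cary Swoveland found a problem when the array contains duplicates. I have added a uniq to de-duplicate the array, although it is not entirely clear what should happen when an element is repeated: should both copies be removed because they are substrings of each other? I resolved this by assuming that duplicates result in only one item in the output, but that may be wrong.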
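
The code for this answer is not preserved in this copy of the thread. A minimal sketch of what the description suggests, assuming uniq followed by delete_if, with each element checked against the array minus itself, might look like the following. The method name remove_substrings is hypothetical, and this is a reconstruction from the prose, not the answerer's original code.

```ruby
# Hypothetical reconstruction based on the description above.
def remove_substrings(array)
  result = array.uniq # assumption: duplicates collapse to a single copy
  result.delete_if do |candidate|
    # "Array math": take the candidate out of the array, then check whether
    # any other string contains it.
    (result - [candidate]).any? { |other| other.include?(candidate) }
  end
end

p remove_substrings(['1', '1 1', '1 1 1', '1 1 1 2', '1 2 3 1', '1 2', '2 3'])
```

Each element is compared against every other element, so the work grows roughly quadratically with the array size.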

Answer 2

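It uses less memory and performs fewer computations. This one removes substrings in both directions, so it loops fewer times. Benchmark results: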

             user       system     total       real
    First    0.000000   0.000000   0.000000 (  0.000076)
    Second   0.010000   0.000000   0.010000 (  0.000037)
    Third    0.000000   0.000000   0.000000 (  0.000019)

These are the benchmark results for the two algorithms mentioned above (First and Second) and for this one (Third).

array = ['1 1 1', '1', '1 1', '1 1 1 2', '1 2 3 1', '1 2', '2 3', '1 2 3', '1 1 1']

i1 = 0
arr_len = array.length
last_index = arr_len - 1

while i1 <= last_index
  w1 = array[i1]
  i2 = i1 + 1
  while i2 <= last_index
    w2 = array[i2]
    # If w2 is a substring of w1
    if w1.include? w2
      # Delete from index i2
      array.delete_at(i2)
      # Decrement the array_length as one element is deleted
      arr_len -= 1
      # Decrement last index, as one element is deleted
      last_index -= 1
      next
    end
    # If w1 turns out to be a substring of w2
    if w2.include? w1
      # Delete the value from that index
      array.delete_at(i1)
      # Decrement the array_length as one element is deleted
      arr_len -= 1
      # Decrement last index, as one element is deleted
      last_index -= 1
      # Reset value of w1 as it is deleted in this operation
      w1 = array[i1]
      # Reset index of 2nd loop to start matching again
      i2 = i1 + 1
      # Move next from here only
      next
    end
    i2 += 1
  end
  i1 += 1
end
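
The timing table in this answer has the layout printed by Ruby's standard Benchmark module. A minimal sketch of how such a measurement might be set up follows, assuming the loop above is wrapped in a method that works on a copy of its input. The name prune_two_way and the condensed form of the loop are illustrative, not the answerer's benchmark script, and the timings will differ from machine to machine.

```ruby
require 'benchmark'

# Hypothetical, condensed wrapper around the two-way loop shown above.
def prune_two_way(input)
  array = input.dup # work on a copy so every benchmark run starts fresh
  i1 = 0
  while i1 < array.length
    i2 = i1 + 1
    while i2 < array.length
      if array[i1].include?(array[i2])
        array.delete_at(i2) # the later element is contained in the earlier one
      elsif array[i2].include?(array[i1])
        array.delete_at(i1) # the earlier element is contained in the later one
        i2 = i1 + 1         # restart the inner scan for the new array[i1]
      else
        i2 += 1
      end
    end
    i1 += 1
  end
  array
end

SAMPLE = ['1 1 1', '1', '1 1', '1 1 1 2', '1 2 3 1', '1 2', '2 3', '1 2 3', '1 1 1'].freeze

Benchmark.bm(7) do |x|
  x.report('Third') { prune_two_way(SAMPLE) }
end
```

With an input this small, a single run takes only microseconds, so repeating each report block many times would give steadier numbers.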

Answer 3

Here is a way to remove substrings as they are found.

a = ['1', '1 1', '1 1 1', '1 1 1 2', '1 2 3 1', '1 2', '2 3']

b = a.dup
b.size.times do
  first, *rest = b
  (rest.any? { |t| t.include? first }) ? b.shift : b.rotate!
end
b #=> ["1 1 1 2", "1 2 3 1"]

To see what is happening, insert

puts "first=\"#{first}\", rest=#{rest}"

after first, *rest = b. This prints the following (after I reformatted it).

first="1",       rest=["1 1", "1 1 1", "1 1 1 2", "1 2 3 1", "1 2", "2 3"]
first="1 1",     rest=["1 1 1", "1 1 1 2", "1 2 3 1", "1 2", "2 3"]
first="1 1 1",   rest=["1 1 1 2", "1 2 3 1", "1 2", "2 3"]
first="1 1 1 2", rest=["1 2 3 1", "1 2", "2 3"]
first="1 2 3 1", rest=["1 2", "2 3", "1 1 1 2"]
first="1 2",     rest=["2 3", "1 1 1 2", "1 2 3 1"]
first="2 3",     rest=["1 1 1 2", "1 2 3 1"]
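
For reuse, the shift/rotate loop can be wrapped in a method. The sketch below assumes a hypothetical method name keep_superstrings and a hypothetical trace keyword that turns the diagnostic output on. The logic is the same as in the snippet above.

```ruby
# Hypothetical wrapper; the method name and the trace keyword are illustrative.
def keep_superstrings(strings, trace: false)
  b = strings.dup
  b.size.times do
    first, *rest = b
    puts "first=#{first.inspect}, rest=#{rest.inspect}" if trace
    # Drop `first` if any remaining string contains it; otherwise move it
    # to the back so that it survives and the next element comes to the front.
    rest.any? { |t| t.include?(first) } ? b.shift : b.rotate!
  end
  b
end

a = ['1', '1 1', '1 1 1', '1 1 1 2', '1 2 3 1', '1 2', '2 3']
p keep_superstrings(a)
```

Because every element is either shifted off or rotated to the back exactly once, the loop runs b.size times and the survivors keep their original relative order.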
