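Algorithm to produce the Cartesian product of arrays in depth-first order

Time: 2010-09-01 19:03:46

Tags: ruby algorithm cartesian-product

I'm looking for an example, in Ruby, a C-like language, or pseudocode, of how to create the Cartesian product of a variable number of arrays of integers, each of differing length, and step through the results in a particular order:

So given [1,2,3],[1,2,3],[1,2,3]: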

[1, 1, 1]
[2, 1, 1]
[1, 2, 1]
[1, 1, 2]
[2, 2, 1]
[1, 2, 2]
[2, 1, 2]
[2, 2, 2]
[3, 1, 1]
[1, 3, 1]
etc.

Instead of the typical result I have seen (including the example I give below):

[1, 1, 1]
[2, 1, 1]
[3, 1, 1]
[1, 2, 1]
[2, 2, 1]
[3, 2, 1]
[1, 3, 1]
[2, 3, 1]
etc.

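The problem with this order is that the third position isn't explored at all until every combination of the first two has been tried. In the code that uses this, it means that even though the right answer is generally (the much larger equivalent of) 1,1,2, it examines a few million possibilities instead of a few thousand before finding it.

I'm dealing with result sets of one million to hundreds of millions, so generating them all and then sorting isn't feasible here. It would also defeat the reason for ordering them as in the first listing, which is to find the correct answer sooner and so break out of the Cartesian product generation earlier.

In case it helps clarify any of the above, here's how I do it now (this gives correct results and the right performance, but not the order I want, i.e. it produces results as in the second listing above):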

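def cartesian(a_of_a)
  a_of_a_len = a_of_a.size
  result = Array.new(a_of_a_len)
  j, k, a2, a2_len = nil, nil, nil, nil
  i = 0                           # running counter, decoded as a mixed-radix number
  while true
    j, k = i, 0
    while k < a_of_a_len
      a2 = a_of_a[k]
      a2_len = a2.size
      result[k] = a2[j % a2_len]  # digit k of i selects the element of array k
      j /= a2_len
      k += 1
    end

    return if j > 0               # i has run past the last combination
    yield result                  # note: the same array object is reused on every yield

    i += 1
  end
end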
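
To put a number on the problem described above, here is a minimal sketch of how far into this counter-based order a given tuple sits. The helper name `mixed_radix_rank` and the 100-element arrays are made up for illustration, and the sketch assumes each array holds distinct values.

```ruby
# Hypothetical helper (not part of the code above): how many tuples the
# counter-based loop yields before it reaches +tuple+.
# Assumes every array holds distinct values and +tuple+ is a valid combination.
def mixed_radix_rank(arrays, tuple)
  rank = 0
  weight = 1
  arrays.zip(tuple) do |arr, value|
    rank += arr.index(value) * weight   # digit for this position
    weight *= arr.size                  # later positions are more significant
  end
  rank
end

arrays = Array.new(3) { (1..100).to_a }
early = mixed_radix_rank(arrays, [2, 1, 1])
late  = mixed_radix_rank(arrays, [1, 1, 2])
puts early  # 1
puts late   # 10000
```

With three arrays of 100 elements, a change in the last position is only reached after all 100 * 100 combinations of the first two.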

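Update: I didn't make it clear that I'm after a solution where all combinations of 1 and 2 are examined before adding in 3, then all 3 and 1, then all 3, 2 and 1, then all 3 and 2. In other words, explore all earlier combinations "horizontally" before going "vertically". The precise order in which those possibilities are explored, i.e. 1,1,2 or 2,1,1, doesn't matter, just that all combinations of 2 and 1 are explored before mixing in 3, and so on.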
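
One possible reading of this requirement is "yield tuples grouped by the largest index they use". The following is only a sketch under that assumption, not the method from any of the answers. The name `level_order_product` is made up, and it relies on `Array#product` accepting a block, so nothing is materialized up front.

```ruby
# Sketch: yields every combination, grouped by the largest index it uses.
# For each level, the "pivot" is the first position that uses index == level:
# positions before it stay strictly below the level, positions after it may
# use anything up to the level. Each tuple is therefore produced exactly once.
def level_order_product(*arrays)
  return to_enum(__method__, *arrays) unless block_given?
  return if arrays.empty? || arrays.any?(&:empty?)

  max_len = arrays.map(&:size).max
  max_len.times do |level|
    arrays.each_index do |pivot|
      next if level >= arrays[pivot].size   # this array has no such index
      pools = arrays.each_with_index.map do |arr, pos|
        if pos < pivot then arr.first(level)
        elsif pos == pivot then [arr[level]]
        else arr.first(level + 1)
        end
      end
      first, *rest = pools
      first.product(*rest) { |tuple| yield tuple }
    end
  end
end

level_order_product([1, 2, 3], [1, 2, 3], [1, 2, 3]).first(9).each { |t| p t }
# prints [1, 1, 1], [2, 1, 1], [2, 1, 2], [2, 2, 1], [2, 2, 2],
#        [1, 2, 1], [1, 2, 2], [1, 1, 2], [3, 1, 1]
```

All eight combinations of 1 and 2 come out before the first tuple containing 3. The order inside a level differs from the listing in the question, which the update says is acceptable.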

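3 Answers:

Answer 0: (score: 2)

Following the clarification in the question, here's a revised version. I'm keeping my previous answer, since it can be useful too and uses a less complex order.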

# yields the possible cartesian products of [first, *rest], where the total
# of the indices that are "distributed" is exactly +nb+ and each index doesn't
# go beyond +depth+, but at least one of them is exactly +depth+
def distribute(nb, depth, reached, first, *rest)
  from  = [nb - rest.size * depth, 0].max
  to    = [first.size-1, depth, nb].min
  from.upto(to) do |i|
    obj = first[i]
    reached ||= i == depth
    if rest.empty?
      yield [obj] if reached
    else
      distribute(nb - i, depth, reached, *rest) do |comb|
        yield [obj, *comb]
      end
    end
  end
end

def depth_first_cartesian(*arrays)
  return to_enum __method__, *arrays unless block_given?
  lengths = arrays.map(&:length)
  lengths.max.times do |depth|
    depth.upto(arrays.size * depth) do |nb|
      distribute(nb, depth, false, *arrays) {|c| yield c}
    end
  end
end

p depth_first_cartesian([1, 2, 3], [1, 2, 3, 4], [1, 2, 3]).to_a
# => [[1, 1, 1], [1, 1, 2], [1, 2, 1], [2, 1, 1], [1, 2, 2], [2, 1, 2], [2, 2, 1], [2, 2, 2],
#     [1, 1, 3], [1, 3, 1], [3, 1, 1], [1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2],
#     [3, 2, 1], [1, 3, 3], [2, 2, 3], [2, 3, 2], [3, 1, 3], [3, 2, 2], [3, 3, 1], [2, 3, 3],
#     [3, 2, 3], [3, 3, 2], [3, 3, 3], [1, 4, 1], [1, 4, 2], [2, 4, 1], [1, 4, 3], [2, 4, 2],
#     [3, 4, 1], [2, 4, 3], [3, 4, 2], [3, 4, 3]]
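
A quick way to check whether a sequence satisfies the "all lower values before any higher value" requirement is to verify that the largest index used never decreases from one tuple to the next. This is only a sketch: the helper name `level_ordered?` is made up, and it assumes each array holds distinct values.

```ruby
# Hypothetical checker: true when the largest index used by each tuple
# never decreases along the sequence.
# Assumes every array holds distinct values (indices are looked up by value).
def level_ordered?(arrays, tuples)
  levels = tuples.map do |tuple|
    tuple.each_with_index.map { |value, pos| arrays[pos].index(value) }.max
  end
  levels.each_cons(2).all? { |a, b| a <= b }
end

arrays   = [[1, 2, 3], [1, 2, 3], [1, 2, 3]]
plain    = arrays[0].product(*arrays[1..-1])   # the usual order
by_level = [[1, 1, 1], [1, 1, 2], [1, 2, 1], [2, 1, 1], [1, 2, 2],
            [2, 1, 2], [2, 2, 1], [2, 2, 2], [1, 1, 3]]
by_sum   = [[1, 1, 1], [1, 1, 2], [1, 2, 1], [2, 1, 1], [1, 1, 3], [1, 2, 2]]

p level_ordered?(arrays, plain)     # false
p level_ordered?(arrays, by_level)  # true
p level_ordered?(arrays, by_sum)    # false
```

Here `by_level` is the start of the output above, while `by_sum` is a prefix of an ordering by index sum, which reaches [1, 1, 3] before [1, 2, 2].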

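Answer 1: (score: 1)

It's not clear where the element [1, 1, 3] goes in your desired output. If my guess is right, the following works (although it could probably be optimized):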

# yields the possible cartesian products of [first, *rest], where the total
# of the indices that are "distributed" is exactly +nb+.
def distribute(nb, first, *rest)
  if rest.empty?                    # single array remaining?
    yield first.fetch(nb) {return}  # yield the right element (if there is one)
  else
    first.each_with_index do |obj, i|
      break if i > nb
      distribute(nb - i, *rest) do |comb|
        yield [obj, *comb]
      end
    end
  end
end

def strange_cartesian(*arrays, &block)
  return to_enum __method__, *arrays unless block_given?
  max = arrays.map(&:length).inject(:+)
  max.times do |nb|
    distribute(nb, *arrays, &block)
  end
end

p strange_cartesian([1, 2, 3], [1, 2, 3], [1, 2, 3]).to_a
# => [[1, 1, 1], [1, 1, 2], [1, 2, 1], [2, 1, 1], [1, 1, 3], [1, 2, 2], [1, 3, 1], [2, 1, 2],
#     [2, 2, 1], [3, 1, 1], [1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 2, 2], [2, 3, 1], [3, 1, 2],
#     [3, 2, 1], [1, 3, 3], [2, 2, 3], [2, 3, 2], [3, 1, 3], [3, 2, 2], [3, 3, 1], [2, 3, 3],
#     [3, 2, 3], [3, 3, 2], [3, 3, 3]]
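
To see what "the total of the indices is exactly +nb+" means for this input, the small sketch below groups index tuples by their sum. It builds the whole product in memory, so it is only meant for tiny inputs and not as a replacement for the code above. The name `index_sum_groups` is made up.

```ruby
# Illustration only: materializes every index tuple, then groups by index sum.
def index_sum_groups(*arrays)
  index_ranges = arrays.map { |arr| (0...arr.size).to_a }
  first, *rest = index_ranges
  first.product(*rest).group_by { |indices| indices.inject(:+) }
end

groups = index_sum_groups([1, 2, 3], [1, 2, 3], [1, 2, 3])
sizes = groups.keys.sort.map { |sum| groups[sum].size }
p sizes  # [1, 3, 6, 7, 6, 3, 1]
```

The group sizes match the output above: one tuple with sum 0, three with sum 1, six with sum 2, and so on.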

Note: If you are still running Ruby 1.8.6, upgrade to at least 1.8.7 (or require 'backports').

Answer 2: (score: 1)

Hey Marc-André, the cartesian gem does exactly what you want:

require 'cartesian'
[1,2,3].x([1,2,3]).to_a #=> [[1, 1], [1, 2], [1, 3], [2, 1], [2, 2], [2, 3], [3, 1], [3, 2], [3, 3]]

You can also use the ** (power) operator for conciseness:

for a,b,c in [1,2,3]**3 ; p [a,b,c] ; end
# output:
#    [1, 1, 1]
#    [1, 1, 2]
#    [1, 1, 3]
#    [1, 2, 1]
#    ...
#    [3, 3, 3]

The project is hosted on github, and its homepage has a link to the RDoc documentation.
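
For comparison, the two listings shown in this answer can also be produced with core Ruby alone, assuming a current Ruby version where `Array#product` and `Array#repeated_permutation` are available. Like the gem output above, this is the usual order in which the last position varies fastest, not the level-by-level order discussed in the question.

```ruby
# Core-Ruby sketch, no gem required.
pairs = [1, 2, 3].product([1, 2, 3])
p pairs
# [[1, 1], [1, 2], [1, 3], [2, 1], [2, 2], [2, 3], [3, 1], [3, 2], [3, 3]]

# Without a block, repeated_permutation returns an Enumerator,
# so tuples are generated on demand.
triples = [1, 2, 3].repeated_permutation(3)
triples.first(4).each { |a, b, c| p [a, b, c] }
# [1, 1, 1]
# [1, 1, 2]
# [1, 1, 3]
# [1, 2, 1]
```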