Question

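I would like to write a function my_func(n,l) that, for a positive integer n, efficiently enumerates the non-negative integer compositions* of length l (where l is greater than n). For example, I want my_func(2,3) to return [[0,0,2],[0,2,0],[2,0,0],[1,1,0],[1,0,1],[0,1,1]].

My initial idea was to use existing code for positive integer partitions (e.g. accel_asc() from this post), pad each partition with zeros, and return all permutations.

import itertools
import numpy

def my_func(n, l):
    for ip in accel_asc(n):
        nic = numpy.zeros(l, dtype=int)
        nic[:len(ip)] = ip
        for p in itertools.permutations(nic):
            yield p

The output of this function is wrong, because every non-negative integer composition in which a number appears twice (or more) shows up several times in the output of my_func. For example, list(my_func(2,3)) returns [(1, 1, 0), (1, 0, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1), (0, 1, 1), (2, 0, 0), (2, 0, 0), (0, 2, 0), (0, 0, 2), (0, 2, 0), (0, 0, 2)].

I could correct this by generating a list of all non-negative integer compositions, removing the repeated entries, and then returning the remaining list (instead of a generator). But this seems incredibly inefficient and will likely run into memory issues. What is a better way to fix this?

Edit

I did a quick comparison of the solutions offered in the answers to this post and to another post that cglacet pointed out in the comments.

On the left we have l=2*n and on the right l=n+1. In both cases, user2357112's second solution is faster than the others when n<=5. For n>5, the solutions proposed by user2357112, Nathan Verzemnieks, and AndyP are more or less tied. However, the conclusions could differ for other relationships between l and n.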

..........

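*I originally asked for non-negative integer partitions. Joseph Wood correctly pointed out that I am actually looking for integer compositions, because the order of the numbers in a sequence matters to me.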
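
To make the duplication concrete, here is a minimal sketch (not part of the original post) of what happens to a single padded partition. itertools.permutations treats elements as distinct by position rather than by value, so the two 1s in (1, 1, 0) are swapped and reported twice. The variable names are chosen for illustration only.

```python
import itertools

# The partition 2 = 1 + 1, padded with one zero to length 3.
padded_partition = (1, 1, 0)

# permutations() distinguishes elements by position, so equal values
# produce repeated tuples.
all_orderings = list(itertools.permutations(padded_partition))
distinct_orderings = sorted(set(all_orderings))

print(len(all_orderings))    # 6
print(distinct_orderings)    # [(0, 1, 1), (1, 0, 1), (1, 1, 0)]
```

Only the three distinct orderings are compositions of 2 into 3 parts; the other three tuples are the repeats described in the question.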

Answer 1

Use the stars and bars concept: pick positions to place l-1 bars between n stars, and count how many stars end up in each section:

import itertools

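def diff(seq):
    # Differences between consecutive elements.
    return [seq[i+1] - seq[i] for i in range(len(seq)-1)]

def generator(n, l):
    # Each combination holds the l-1 bar positions as a non-decreasing tuple.
    for combination in itertools.combinations_with_replacement(range(n+1), l-1):
        # Sentinels at 0 and n make every part a gap between neighbours,
        # which also covers l == 1 (no bars at all).
        yield diff((0,) + combination + (n,))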

I've used combinations_with_replacement instead of combinations here, so the index handling is a bit different from what you'd need with combinations. The code with combinations would more closely match a standard treatment of stars and bars.
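
For comparison, a variant built on plain combinations might look like the sketch below. It is one possible reading of the standard stars and bars layout, not code from the answer: n stars and l-1 bars share n+l-1 slots, and the function name compositions_via_combinations is made up for this example.

```python
import itertools

def compositions_via_combinations(n, l):
    """Yield every list of l non-negative integers that sums to n."""
    total_slots = n + l - 1
    # Choose which of the n + l - 1 slots hold the l - 1 bars.
    for bars in itertools.combinations(range(total_slots), l - 1):
        # Sentinel bars just outside both ends make every part the number
        # of star slots between two consecutive bars.
        edges = (-1,) + bars + (total_slots,)
        yield [edges[i + 1] - edges[i] - 1 for i in range(l)]

print(list(compositions_via_combinations(2, 3)))
# [[0, 0, 2], [0, 1, 1], [0, 2, 0], [1, 0, 1], [1, 1, 0], [2, 0, 0]]
```

Because the bars occupy distinct slots here, each part is a gap minus one, which is the index shift the paragraph above alludes to.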

Alternatively, a different way to use combinations_with_replacement: start with a list of l zeros, pick n positions with replacement from l possible positions, and add 1 to each of the chosen positions to produce an output:

def generator2(n, l):
    for combination in itertools.combinations_with_replacement(range(l), n):
        output = [0]*l
        for i in combination:
            output[i] += 1
        yield output
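
A quick sanity check, sketched under the assumption of Python 3.8+ (for math.comb): stars and bars predicts C(n+l-1, l-1) compositions, so the generator should produce exactly that many results with no repeats. generator2 is repeated so the snippet runs on its own; expected_count is a helper name introduced here.

```python
import itertools
import math

def generator2(n, l):
    # Same approach as above, repeated so this snippet is self-contained.
    for combination in itertools.combinations_with_replacement(range(l), n):
        output = [0] * l
        for i in combination:
            output[i] += 1
        yield output

def expected_count(n, l):
    # Stars and bars: choose l - 1 bar positions among n + l - 1 slots.
    return math.comb(n + l - 1, l - 1)

results = [tuple(c) for c in generator2(2, 3)]
print(results)
# [(2, 0, 0), (1, 1, 0), (1, 0, 1), (0, 2, 0), (0, 1, 1), (0, 0, 2)]
print(len(results), len(set(results)), expected_count(2, 3))  # 6 6 6
```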

Answer 2

Starting from a simple recursive solution, which has the same problem as yours:

def nn_partitions(n, l):
    if n == 0:
        yield [0] * l
    else:
        for part in nn_partitions(n - 1, l):
            for i in range(l):
                new = list(part)
                new[i] += 1
                yield new

That is, for each partition for the next lower number, for each place in that partition, add 1 to the element in that place. It yields the same duplicates yours does. I remembered a trick for a similar problem, though: when you alter a partition p for n into one for n+1, fix all the elements of p to the left of the element you increase. That is, keep track of where p was modified, and never modify any of p's "descendants" to the left of that. Here's the code for that:

def _nn_partitions(n, l):
    if n == 0:
        yield [0] * l, 0
    else:
        for part, start in _nn_partitions(n - 1, l):
            for i in range(start, l):
                new = list(part)
                new[i] += 1
                yield new, i

def nn_partitions(n, l):
    for part, _ in _nn_partitions(n, l):
        yield part

It's very similar - there's just the extra parameter passed along at each step, so I added a wrapper to remove that for the caller.
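
The effect of the restriction can be checked on a small case. The sketch below restates both recursions under the illustrative names naive and restricted (not the names used above) and compares how many results each yields for n=2, l=3.

```python
def naive(n, l):
    # Unrestricted recursion: any position may be incremented at each step.
    if n == 0:
        yield [0] * l
    else:
        for part in naive(n - 1, l):
            for i in range(l):
                new = list(part)
                new[i] += 1
                yield new

def restricted(n, l):
    # Same recursion, but each result carries the index that was last
    # incremented, and later increments start from that index.
    if n == 0:
        yield [0] * l, 0
    else:
        for part, start in restricted(n - 1, l):
            for i in range(start, l):
                new = list(part)
                new[i] += 1
                yield new, i

naive_out = [tuple(p) for p in naive(2, 3)]
fixed_out = [tuple(p) for p, _ in restricted(2, 3)]
print(len(naive_out), len(set(naive_out)))  # 9 6
print(len(fixed_out), len(set(fixed_out)))  # 6 6
```

One way to see why the duplicates disappear: the restricted version only ever increments positions in non-decreasing index order, and each composition corresponds to exactly one such order.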

I haven't tested it extensively, but this appears to be reasonably fast: about 35 microseconds for nn_partitions(3, 5) and about 18 seconds for nn_partitions(10, 20) (which yields just over 20 million partitions). (The very elegant solution from user2357112 takes about twice as long for the smaller case and about four times as long for the larger one. Edit: this refers to the first solution from that answer; the second one is faster than mine under some circumstances and slower under others.)
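
For anyone who wants to repeat the measurement, a rough sketch with timeit follows. It is not the setup used for the figures above, which is not given; absolute times depend on the machine and the Python version. The result count for the larger case is computed with math.comb (Python 3.8+) rather than by enumeration.

```python
import math
import timeit

def _nn_partitions(n, l):
    if n == 0:
        yield [0] * l, 0
    else:
        for part, start in _nn_partitions(n - 1, l):
            for i in range(start, l):
                new = list(part)
                new[i] += 1
                yield new, i

def nn_partitions(n, l):
    for part, _ in _nn_partitions(n, l):
        yield part

# Size of the larger case, C(n + l - 1, l - 1), without enumerating it.
large_count = math.comb(10 + 20 - 1, 20 - 1)
print(large_count)  # 20030010

# Time only the small case; the number of runs is an arbitrary choice.
runs = 1000
seconds = timeit.timeit(lambda: list(nn_partitions(3, 5)), number=runs)
print(f"{seconds / runs * 1e6:.1f} microseconds per call")
```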

Efficient enumeration of non-negative integer compositions
