Partitioning non-unique entities into unique sets with Python

Time: 2013-11-16 04:01:11

Tags: python

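I have a list of items, e.g. [1,1,1,1,2,2], and I am trying to find all the unique groupings in which these items are bundled into tuples of length one or two. For example, for the list above, I would like to find the following 10 possible groupings: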

[[(1,),(1,),(1,),(1,),(2,),(2,)],
 [(1,1),(1,),(1,),(2,),(2,)],
 [(1,2),(1,),(1,),(1,),(2,)],
 [(2,2),(1,),(1,),(1,),(1,)],
 [(1,1),(1,1),(2,),(2,)],
 [(1,1),(1,2),(1,),(2,)],
 [(1,2),(1,2),(1,),(1,)],
 [(2,2),(1,1),(1,),(1,)],
 [(1,1),(1,1),(2,2)],
 [(1,1),(1,2),(1,2)]]

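I have been playing around with itertools, but have only managed to use it to find the unique possible tuples (e.g. set(list(itertools.combinations((1,1,1,1,2,2),2)))), and every search I do turns up solutions where the size of each group is constant and/or where duplicate elements are not considered (example1, example2).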
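For reference, here is a minimal sketch of what that itertools attempt produces. It only finds the distinct pairs that could serve as pieces, not the groupings of the whole list; the variable names are illustrative.

```python
import itertools

items = (1, 1, 1, 1, 2, 2)

# combinations() treats each position as distinct, so equal values
# produce repeated tuples; set() collapses them to the distinct pairs.
pairs = set(itertools.combinations(items, 2))
print(sorted(pairs))  # [(1, 1), (1, 2), (2, 2)]
```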

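Ultimately, I am looking for a solution that works for lists of all ones ([1,1,1,...,1]), all twos ([2,2,2,...,2]), or any intermediate combination containing an arbitrary number of 1s and 2s.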

1 answer:

Answer 0 (score: 4)

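As I noted in a comment, the maximum length of the input list is crucial. The example code here solves the specific example you gave quickly enough, by post-processing the full set of partitions (to weed out duplicates, and to weed out partitions with pieces that are "too large"). But it will be horribly inefficient for a "long" original list: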

def part(xs):  # generate all partitions of xs
    xs = tuple(xs)
    n = len(xs)
    def extend(i):
        if i == n:
            yield ()
            return
        this = xs[i]
        for o in extend(i+1):
            yield ((this,),) + o
            for j, p in enumerate(o):
                yield o[:j] + ((this,) + p,) + o[j+1:]
    for o in extend(0):
        yield o

def upart(xs):  # weed out dups, and partitions with a piece bigger than 2
    from collections import Counter
    seen = []
    for p in part(xs):
        if all(len(chunk) <= 2 for chunk in p):
            c = Counter(p)
            if c not in seen:
                seen.append(c)
                yield p

xs = [1,1,1,1,2,2]
for o in upart(xs):
    print(o)

That displays the 10 unique partitions you are looking for.

BTW, xs = [1,1,1,1,1,1] produces:

((1,), (1,), (1,), (1,), (1,), (1,))
((1, 1), (1,), (1,), (1,), (1,))
((1, 1), (1, 1), (1,), (1,))
((1, 1), (1, 1), (1, 1))
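To put a rough number on "horribly inefficient": part(xs) yields every set partition of the positions in xs, and for a list of length n the count of those is the Bell number B(n). The sketch below computes the first few values with the Bell triangle; bell_numbers is a hypothetical helper added for illustration, not part of the code above.

```python
def bell_numbers(limit):
    """Return [B(0), ..., B(limit)] using the Bell triangle."""
    row = [1]
    bells = [1]
    for _ in range(limit):
        # Each new row starts with the last entry of the previous row;
        # every later entry adds the entry above-left to its left neighbour.
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells

print(bell_numbers(10))
```

For the six-item example that is B(6) = 203 candidate partitions filtered down to 10 results; at length 10 it is already 115975 candidates.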

Custom generator

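As noted in the comments, if post-processing the results of general building blocks is too inefficient, you need to "roll your own" from scratch. Here is a space-efficient approach that builds unique results by construction (rather than by post-processing). There is really no "general way" to do this: it requires analyzing the specific problem at hand and writing code that exploits whatever quirks you can find: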

def custom_gen(xs):
    from collections import Counter
    assert all(1 <= i <= 2 for i in xs)
    # There are only 5 unique pieces that can be used:
    pieces = [(1,), (2,), (1, 1), (2, 2), (1, 2)]
    countpieces = {piece: Counter(piece) for piece in pieces}

    def extend(i, n1, n2, result):
        # try all ways of extending with pieces[i];
        # there are n1 1's and n2 2's remaining to be used
        assert n1 >= 0 and n2 >= 0
        if n1 == n2 == 0:
            yield result
            return
        if i == len(pieces):  # dead end
            return
        piece = pieces[i]
        c = countpieces[piece]
        p1 = c[1]
        p2 = c[2]
        # What's the most number of this piece we could
        # possibly take?
        assert p1 or p2
        if p1:
            if p2:
                most = min(n1 // p1, n2 // p2)
            else:
                most = n1 // p1
        else:
            most = n2 // p2
        for count in range(most + 1):
            for t in extend(i+1,
                            n1 - count * p1,
                            n2 - count * p2,
                            result + [piece] * count):
                yield t

    c = Counter(xs)
    for t in extend(0, c[1], c[2], []):
        yield t

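Note that the recursion never goes more than 5 deep (no matter how long the input list is), so I would bet this is about as efficient as can be done without deeper analysis of the mathematics of the problem.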
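As one possible next step in that direction, here is a sketch that assumes the input holds only 1s and 2s. A grouping is then fully determined by how many (1, 1), (2, 2) and (1, 2) pieces it uses, because whatever is left over must be singletons. The groupings function below is a hypothetical alternative, not the answer's method; it loops over those three counts directly instead of recursing.

```python
def groupings(n1, n2):
    """Yield each grouping of n1 ones and n2 twos into pieces of size 1 or 2."""
    for a in range(n1 // 2 + 1):              # number of (1, 1) pieces
        for b in range(n2 // 2 + 1):          # number of (2, 2) pieces
            left1 = n1 - 2 * a                # ones not yet used
            left2 = n2 - 2 * b                # twos not yet used
            for c in range(min(left1, left2) + 1):  # number of (1, 2) pieces
                yield ([(1, 1)] * a + [(2, 2)] * b + [(1, 2)] * c
                       + [(1,)] * (left1 - c) + [(2,)] * (left2 - c))

results = list(groupings(4, 2))   # the question's [1,1,1,1,2,2]
print(len(results))               # 10
```

Each triple of counts gives a different multiset of pieces, so no duplicate filtering is needed. The order of the results differs from the listing in the question.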