懒惰地将N个项目分区为Python中的K个bin

时间:2012-10-30 00:12:29

标签: python combinatorics

给出一个算法(或直接的Python代码),它将N个项目集合的所有分区生成K个bin,这样每个bin至少有一个项目。在订单重要且订单无关紧要的情况下,我需要这个。

订单重要的例子

>>> list(partition_n_in_k_bins_ordered((1,2,3,4), 2))
[([1], [2,3,4]), ([1,2], [3,4]), ([1,2,3], [4])]

>>> list(partition_n_in_k_bins_ordered((1,2,3,4), 3))
[([1], [2], [3,4]), ([1], [2,3], [4]), ([1,2], [3], [4])]

>>> list(partition_n_in_k_bins_ordered((1,2,3,4), 4))
[([1], [2], [3], [4])]

订单无关紧要的例子

>>> list(partition_n_in_k_bins_unordered({1,2,3,4}, 2))
[{{1}, {2,3,4}}, {{2}, {1,3,4}}, {{3}, {1,2,4}}, {{4}, {1,2,3}},
 {{1,2}, {3,4}}, {{1,3}, {2,4}}, {{1,4}, {2,3}}]

这些函数应该生成惰性迭代器/生成器,而不是列表。理想情况下,他们会使用itertools中的原语。我怀疑有一个聪明的解决方案让我望而却步。

虽然我在Python中要求这个,但我也愿意翻译一个清晰的算法。

2 个答案:

答案 0 :(得分:4)

你需要一个递归函数来解决这类问题:你取一个列表,取一个增加长度的子部分,并将相同的程序应用于n-1个列表的剩余尾部。

这是我对有序组合的看法

def partition(lista,bins):
    if len(lista)==1 or bins==1:
        yield [lista]
    elif len(lista)>1 and bins>1:
        for i in range(1,len(lista)):
            for part in partition(lista[i:],bins-1):
                if len([lista[:i]]+part)==bins:
                    yield [lista[:i]]+part

for i in partition(range(1,5),1): 
    print i
#[[1, 2, 3, 4]]

for i in partition(range(1,5),2): 
    print i
#[[1], [2, 3, 4]]
#[[1, 2], [3, 4]]
#[[1, 2, 3], [4]]

for i in partition(range(1,5),3):
    print i
#[[1], [2], [3, 4]]
#[[1], [2, 3], [4]]
#[[1, 2], [3], [4]] 

for i in partition(range(1,5),4):
    print i
#[[1], [2], [3], [4]]

答案 1 :(得分:3)

Enrico的算法,Knuth的,只有我的胶水需要粘贴一些返回列表或一组集合的东西(在案例元素不可清除的情况下作为列表列表返回)。

def kbin(l, k, ordered=True):
    """
    Return sequence ``l`` partitioned into ``k`` bins.

    Examples
    ========

    The default is to give the items in the same order, but grouped
    into k partitions:

    >>> for p in kbin(range(5), 2):
    ...     print p
    ...
    [[0], [1, 2, 3, 4]]
    [[0, 1], [2, 3, 4]]
    [[0, 1, 2], [3, 4]]
    [[0, 1, 2, 3], [4]]

    Setting ``ordered`` to None means that the order of the elements in
    the bins is irrelevant and the order of the bins is irrelevant. Though
    they are returned in a canonical order as lists of lists, all lists
    can be thought of as sets.

    >>> for p in kbin(range(3), 2, ordered=None):
    ...     print p
    ...
    [[0, 1], [2]]
    [[0], [1, 2]]
    [[0, 2], [1]]

    """
    from sympy.utilities.iterables import (
        permutations, multiset_partitions, partitions)

    def partition(lista, bins):
        #  EnricoGiampieri's partition generator from
        #  http://stackoverflow.com/questions/13131491/
        #  partition-n-items-into-k-bins-in-python-lazily
        if len(lista) == 1 or bins == 1:
            yield [lista]
        elif len(lista) > 1 and bins > 1:
            for i in range(1, len(lista)):
                for part in partition(lista[i:], bins - 1):
                    if len([lista[:i]] + part) == bins:
                        yield [lista[:i]] + part
    if ordered:
        for p in partition(l, k):
            yield p
    else:
        for p in multiset_partitions(l, k):
            yield p