Exclude combinations before computing

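Date: 2018-02-05 06:04:03

Tags: python python-3.x combinations itertools

Given a list and a set of exclusion elements, is it possible to skip computing the combinations that contain those elements?

Example 1

Given l = [1, 2, 3, 4, 5], I want to compute all combinations of size 4 and exclude the combinations that contain (1, 3) before they are computed.

The results would be: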

    All results:            Wanted results:

    [1, 2, 3, 4]            [1, 2, 4, 5]
    [1, 2, 3, 5]            [2, 3, 4, 5]
    [1, 2, 4, 5]
    [1, 3, 4, 5]
    [2, 3, 4, 5]

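All combinations containing both 1 and 3 have been removed.

Example 2

Suggested by @Eric Duminil

The results for l = [1, 2, 3, 4, 5, 6], size 4, when

  • excluding (1, 2, 3), in the second column
  • excluding (1, 2), in the third column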

    All results:        Wanted results 1            Wanted results 2
                        (Excluding [1, 2, 3]):      (Excluding [1, 2])
    
    [1, 2, 3, 4]        [1, 2, 4, 5]                [1, 3, 4, 5]
    [1, 2, 3, 5]        [1, 2, 4, 6]                [1, 3, 4, 6]
    [1, 2, 3, 6]        [1, 2, 5, 6]                [1, 3, 5, 6]
    [1, 2, 4, 5]        [1, 3, 4, 5]                [1, 4, 5, 6]
    [1, 2, 4, 6]        [1, 3, 4, 6]                [2, 3, 4, 5]
    [1, 2, 5, 6]        [1, 3, 5, 6]                [2, 3, 4, 6]
    [1, 3, 4, 5]        [1, 4, 5, 6]                [2, 3, 5, 6]
    [1, 3, 4, 6]        [2, 3, 4, 5]                [2, 4, 5, 6]
    [1, 3, 5, 6]        [2, 3, 4, 6]                [3, 4, 5, 6]
    [1, 4, 5, 6]        [2, 3, 5, 6]                                
    [2, 3, 4, 5]        [2, 4, 5, 6]                                
    [2, 3, 4, 6]        [3, 4, 5, 6]                                
    [2, 3, 5, 6]           
    [2, 4, 5, 6]           
    [3, 4, 5, 6]        
    

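All combinations containing 1 and 2 and 3 have been removed from wanted results 1.

All combinations containing 1 and 2 have been removed from wanted results 2.

I have a much larger set of combinations to compute, but it takes a lot of time, and I want to use these exclusions to reduce that time.

Attempted solutions

With method 1, the combinations are still computed.

With method 2, I tried to modify the combinations function, but I could not find a proper way to skip my exclusion list before computing.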

            Method 1                    |               Method 2
                                        |               
def main():                             |   def combinations(iterable, r):
    l = list(range(1, 6))               |       pool = tuple(iterable)
    comb = combinations(l, 4)           |       n = len(pool)
                                        |       if r > n:
    for i in comb:                      |           return
        if set([1, 3]).issubset(i):     |       indices = list(range(r))
            continue                    |       yield tuple(pool[i] for i in indices)
        else:                           |       while True:
            process()                   |           for i in reversed(range(r)):
                                        |               if indices[i] != i + n - r:
                                        |                   break
                                        |           else:
                                        |               return
                                        |           indices[i] += 1
                                        |           for j in range(i+1, r):
                                        |               indices[j] = indices[j-1] + 1
                                        |           yield tuple(pool[i] for i in indices)

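Edit:

First of all, thank you all for your help. I forgot to give more details about the constraints.

  • The order of the output does not matter; for example, it makes no difference whether the result is [1, 2, 4, 5] [2, 3, 4, 5] or [2, 3, 4, 5] [1, 2, 4, 5].

  • The elements of a combination should (if possible) be sorted, [1, 2, 4, 5] [2, 3, 4, 5] rather than [2, 1, 5, 4] [3, 2, 4, 5], but this is not important, since the combinations can be sorted afterwards.

  • An exclusion list is a list of items that must not all appear together in a combination. For example, if my exclusion list is (1, 2, 3), no combination containing 1 and 2 and 3 should be computed. However, combinations with 1 and 2 but not 3 are allowed. It follows that if I exclude combinations containing (1, 2) and (1, 2, 3), the second exclusion is completely useless, because every combination filtered by (1, 2, 3) is already filtered by (1, 2).

  • Multiple exclusion lists must be possible, because I use several constraints on my combinations.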
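The redundancy described in the third bullet can be removed up front, before any combinations are generated. The following is a minimal sketch, assuming the exclusion lists are iterables of hashable items; `minimal_excludes` is a hypothetical helper name, not something from the question.

```python
def minimal_excludes(excludes):
    """Drop every exclusion list that is a superset of another one."""
    # Deduplicate, then sort by size so that smaller sets are considered first
    sets = sorted({frozenset(e) for e in excludes}, key=len)
    kept = []
    for s in sets:
        # s is redundant if an already-kept exclusion is contained in it
        if not any(k <= s for k in kept):
            kept.append(s)
    return kept

# (1, 2, 3) is dropped because (1, 2) already covers it
print(minimal_excludes([(1, 2), (1, 2, 3), (4, 5)]))
```

Fewer exclusion sets means fewer subset checks per candidate combination, whichever of the approaches below is used.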

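Tested answers

@tobias_k This solution treats the exclusion list (1, 2, 3) as an OR exclusion, meaning that (1, 2), (2, 3) and (1, 3) are all excluded, if I understood correctly. That is useful in some cases but not for my current problem; I have modified the question to give more details, sorry for the confusion. In your answer, I cannot use only the lists (1, 2) and (1, 3) as exclusions, as you pointed out. The big advantage of this solution, however, is that it allows multiple exclusions.

@Kasramvd and @mikuszefski Your solutions are really close to what I want; if they supported multiple exclusion lists, they would be the answer.

Thanks

7 answers:

Answer 0: (score: 5)

(It turns out this does not do exactly what the OP wants. Still leaving it here, as it might help others.)

To handle mutually exclusive elements, you can wrap those elements in lists within the list, take the combinations of those, and then the product of each combination of sublists: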

>>> from itertools import combinations, product
>>> l = [[1, 3], [2], [4], [5]]
>>> [c for c in combinations(l, 4)]
[([1, 3], [2], [4], [5])]
>>> [p for c in combinations(l, 4) for p in product(*c)]
[(1, 2, 4, 5), (3, 2, 4, 5)]

A more complex example:

>>> l = [[1, 3], [2, 4, 5], [6], [7]]
>>> [c for c in combinations(l, 3)]
[([1, 3], [2, 4, 5], [6]),
 ([1, 3], [2, 4, 5], [7]),
 ([1, 3], [6], [7]),
 ([2, 4, 5], [6], [7])]
>>> [p for c in combinations(l, 3) for p in product(*c)]
[(1, 2, 6),
 (1, 4, 6),
 ... 13 more ...
 (4, 6, 7),
 (5, 6, 7)]

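This does not generate any "junk" combinations that have to be filtered out afterwards. However, it assumes that you want at most one element from each "exclusive" group; in the second example, it prevents not only combinations with 2,4,5 but also those with 2,4, with 4,5, or with 2,5. Also, it is not possible (or at least not easy) to forbid 1,3 and 1,5 while still allowing 3,5. (It might be possible to extend it to those cases, but I am not yet sure if and how.)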
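The over-exclusion described above can be made visible by comparing the grouped approach with a plain subset filter. This is a minimal sketch using the groups from the second example; the names `grouped`, `filtered` and `missing` are chosen here for illustration.

```python
from itertools import combinations, product

groups = [[1, 3], [2, 4, 5], [6], [7]]
elements = [x for g in groups for x in g]
excludes = [set(g) for g in groups if len(g) > 1]

# Grouped approach: at most one element from each group
grouped = {frozenset(p) for c in combinations(groups, 3) for p in product(*c)}

# Subset filter: reject only combinations that contain a whole group
filtered = {frozenset(c) for c in combinations(elements, 3)
            if not any(e.issubset(c) for e in excludes)}

# Combinations the filter allows but the grouped approach never produces
missing = sorted(sorted(s) for s in filtered - grouped)
print(len(grouped), len(filtered))  # 17 29
print(missing)
```

The 12 missing combinations are exactly those that take two elements from the group [2, 4, 5], such as [2, 4, 6].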

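You can wrap this in a function that derives the slightly different input format from your (presumed) format and returns a corresponding generator expression. Here, lst is the list of elements, r is the number of items per combination, and exclude_groups is a list of groups of mutually exclusive elements: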

from itertools import combinations, product

def comb_with_excludes(lst, r, exclude_groups):
    ex_set = {e for es in exclude_groups for e in es}
    tmp = exclude_groups + [[x] for x in lst if x not in ex_set]
    return (p for c in combinations(tmp, r) for p in product(*c))

lst = [1, 2, 3, 4, 5, 6, 7]
excludes = [[1, 3], [2, 4, 5]]
for x in comb_with_excludes(lst, 3, excludes):
    print(x)

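Answer 1: (score: 4)

From an algorithmic perspective, you can separate the excluded items from the rest of the valid items, compute the combinations of each set separately, and concatenate the results according to the desired length. This approach never produces a combination that contains all of the excluded items at once, but it does not preserve the original order.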

from itertools import combinations

def comb_with_exclude(iterable, comb_num, excludes):
    iterable = tuple(iterable)
    ex_len = len(excludes)
    n = len(iterable)

    if comb_num < ex_len or comb_num > n:
        yield from combinations(iterable, comb_num)

    else:
        rest = [i for i in iterable if i not in excludes]
        ex_comb_rang = range(0, ex_len)
        rest_comb_range = range(comb_num, comb_num - ex_len, -1)
        # sum of these pairs is equal to the comb_num
        pairs = zip(ex_comb_rang, rest_comb_range)

        for i, j in pairs:
            for p in combinations(excludes, i):
                for k in combinations(rest, j):
                    yield k + p
        # Note: instead of the nested loops above, you could wrap the
        # combinations in a product call (from itertools), like this:
        # for p, k in product(combinations(excludes, i), combinations(rest, j)):
        #     yield k + p

Demo:

l = [1, 2, 3, 4, 5, 6, 7, 8]
ex = [2, 5, 6]
print(list(comb_with_exclude(l, 6, ex)))

[(1, 3, 4, 7, 8, 2), (1, 3, 4, 7, 8, 5), (1, 3, 4, 7, 8, 6), (1, 3, 4, 7, 2, 5), (1, 3, 4, 8, 2, 5), (1, 3, 7, 8, 2, 5), (1, 4, 7, 8, 2, 5), (3, 4, 7, 8, 2, 5), (1, 3, 4, 7, 2, 6), (1, 3, 4, 8, 2, 6), (1, 3, 7, 8, 2, 6), (1, 4, 7, 8, 2, 6), (3, 4, 7, 8, 2, 6), (1, 3, 4, 7, 5, 6), (1, 3, 4, 8, 5, 6), (1, 3, 7, 8, 5, 6), (1, 4, 7, 8, 5, 6), (3, 4, 7, 8, 5, 6)]

l = [1, 2, 3, 4, 5]
ex = [1, 3]
print(list(comb_with_exclude(l, 4, ex)))

[(2, 4, 5, 1), (2, 4, 5, 3)]
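As a sanity check on the demo output: with a single exclusion list of length e, the only combinations removed are those that contain all e excluded items, and there are C(n - e, r - e) of them. The following is a minimal sketch, assuming Python 3.8+ for `math.comb` and assuming every excluded item occurs in the list; `expected_count` is a hypothetical helper.

```python
from math import comb

def expected_count(n, r, e):
    """Count r-combinations of n items that do not contain all e excluded items."""
    if r < e:
        return comb(n, r)  # too short to contain the whole exclusion list
    return comb(n, r) - comb(n - e, r - e)

print(expected_count(8, 6, 3))  # 18, the length of the first demo result
print(expected_count(5, 4, 2))  # 2, the length of the second demo result
```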

Benchmark against the other answers:

Result: this approach is faster than the others.

# this answer
In [169]: %timeit list(comb_with_exclude(lst, 3, excludes[0]))
100000 loops, best of 3: 6.47 µs per loop

# tobias_k
In [158]: %timeit list(comb_with_excludes(lst, 3, excludes))
100000 loops, best of 3: 13.1 µs per loop

# Vikas Damodar
In [166]: %timeit list(combinations_exc(lst, 3))
10000 loops, best of 3: 148 µs per loop

# mikuszefski
In [168]: %timeit list(sub_without(lst, 3, excludes[0]))
100000 loops, best of 3: 12.52 µs per loop

Answer 2: (score: 1)

I have tried editing combinations according to your requirements:

def combinations(iterable, r):
    # combinations('ABCD', 2) --> AB AC AD BC BD CD
    # combinations(range(4), 3) --> 012 013 023 123
    pool = tuple(iterable)
    n = len(pool)
    if r > n:
        return
    indices = list(range(r))
    comb = tuple(pool[i] for i in indices)
    # The first combination goes through the same exclusion check
    if not (1 in comb and 3 in comb):
        yield comb
    while True:
        for i in reversed(range(r)):
            if indices[i] != i + n - r:
                break
        else:
            return
        indices[i] += 1
        for j in range(i+1, r):
            indices[j] = indices[j-1] + 1
        comb = tuple(pool[i] for i in indices)
        # Skip combinations that contain both 1 and 3
        if not (1 in comb and 3 in comb):
            yield comb


d = combinations(list(range(1, 6)),4)
for i in d:
   print(i)

It will return something like this:

  

(1,2,4,5)   (2,3,4,5)

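Answer 3: (score: 1)

I did the exclusion while generating the combinations, using the following code, to save the time of a second loop. You just need to pass the indices of the excluded elements as a set.

Update: working fiddle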

from itertools import permutations

def combinations(iterable, r, combIndeciesExclusions=set()):
    pool = tuple(iterable)
    n = len(pool)
    for indices in permutations(range(n), r):
        if ( len(combIndeciesExclusions)==0 or not combIndeciesExclusions.issubset(indices)) and sorted(indices) == list(indices):
            yield tuple(pool[i] for i in indices)


l = list(range(1, 6))
comb = combinations(l, 4, set([0,2]))
print(list(comb))

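Answer 4: (score: 1)

(It turns out that my previous answer does not really satisfy the constraints of the question, so here is another one. I am posting it as a separate answer because the approach is quite different and the original answer may still help others.)

You can implement this recursively: before each recursive call that adds another element to the combination, check whether doing so would violate one of the exclude sets. This does not generate any invalid combinations, and it works with overlapping exclude sets (like (1,3), (1,5)) and with exclude sets of more than two elements (like (2,4,5), allowing any combination of them except all of them together).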

def comb_with_excludes(lst, n, excludes, i=0, taken=()):
    if n == 0:
        yield taken  # no more needed
    elif i <= len(lst) - n:
        t2 = taken + (lst[i],)  # add current element
        if not any(e.issubset(t2) for e in excludes):
            yield from comb_with_excludes(lst, n-1, excludes, i+1, t2)
        if i < len(lst) - n:  # skip current element
            yield from comb_with_excludes(lst, n, excludes, i+1, taken)

Example:

>>> lst = [1, 2, 3, 4, 5, 6]
>>> excludes = [{1, 3}, {1, 5}, {2, 4, 5}]
>>> list(comb_with_excludes(lst, 4, excludes))
[(1, 2, 4, 6), (2, 3, 4, 6), (2, 3, 5, 6), (3, 4, 5, 6)]

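Well, I have timed it now, and it turns out this is considerably slower than naively using itertools.combinations in a generator expression with a filter, much like you already do:

import itertools

def comb_naive(lst, r, excludes):
    return (comb for comb in itertools.combinations(lst, r)
                 if not any(e.issubset(comb) for e in excludes))

Computing the combinations in Python is simply slower than using the library (which is probably implemented in C) and filtering the results afterwards. Depending on the number of combinations that can be excluded, the recursive version might be faster in some cases, but to be honest, I have my doubts.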

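You could get better results if you can use itertools.combinations for subproblems, as in Kasramvd's answer, but that is more difficult with multiple, non-disjoint exclude sets. One way might be to separate the elements of the list into two sets: those that have constraints and those that do not. Then, use itertools.combinations for both, but check the constraints only for the combinations of the elements that matter. You still have to check and filter the results, but only a part of them. (One caveat, though: the results are not generated in order, and the order of the elements within the yielded combinations is somewhat mixed up, too.)

import itertools

def comb_with_excludes2(lst, n, excludes):
    # Elements that appear in no exclude set, and elements that appear in at least one
    wout_const = [x for x in lst if not any(x in e for e in excludes)]
    with_const = [x for x in lst if any(x in e for e in excludes)]
    # k is the number of constrained elements in a combination
    k_min, k_max = max(0, n - len(wout_const)), min(n, len(with_const))
    return (c1 + c2 for k in range(k_min, k_max + 1)  # k_max itself is a valid count
                    for c1 in itertools.combinations(with_const, k)
                    if not any(e.issubset(c1) for e in excludes)
                    for c2 in itertools.combinations(wout_const, n - k))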

>>> lst = [1, 2, 3, 4, 5, 6]
>>> excludes = [{1, 3}, {1, 5}, {2, 4, 5}]
>>> %timeit list(comb_with_excludes(lst, 4, excludes))
10000 loops, best of 3: 42.3 µs per loop
>>> %timeit list(comb_with_excludes2(lst, 4, excludes))
10000 loops, best of 3: 22.6 µs per loop
>>> %timeit list(comb_naive(lst, 4, excludes))
10000 loops, best of 3: 16.4 µs per loop

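This is already much better than the recursive pure-Python solution, but still not as good as the "naive" approach for the example above. With a longer list: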

>>> lst = list(range(20))
>>> %timeit list(comb_with_excludes(lst, 4, excludes))
10 loops, best of 3: 15.1 ms per loop
>>> %timeit list(comb_with_excludes2(lst, 4, excludes))
1000 loops, best of 3: 558 µs per loop
>>> %timeit list(comb_naive(lst, 4, excludes))
100 loops, best of 3: 5.9 ms per loop
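Because the split approach yields combinations in a different order, a quick way to confirm that it agrees with the naive filter is to compare the results as sets. This is a minimal sketch; `comb_split` restates the idea above as a generator function with an inclusive upper bound on k, and the names `comb_split`, `constrained` and `free` are chosen here for illustration.

```python
import itertools

def comb_naive(lst, r, excludes):
    return (c for c in itertools.combinations(lst, r)
            if not any(e.issubset(c) for e in excludes))

def comb_split(lst, r, excludes):
    constrained = [x for x in lst if any(x in e for e in excludes)]
    free = [x for x in lst if not any(x in e for e in excludes)]
    k_min = max(0, r - len(free))
    k_max = min(r, len(constrained))
    for k in range(k_min, k_max + 1):  # k = number of constrained elements used
        for c1 in itertools.combinations(constrained, k):
            if any(e.issubset(c1) for e in excludes):
                continue  # checked once, before pairing with the free elements
            for c2 in itertools.combinations(free, r - k):
                yield c1 + c2

lst = list(range(20))
excludes = [{1, 3}, {1, 5}, {2, 4, 5}]
a = {frozenset(c) for c in comb_naive(lst, 4, excludes)}
b = {frozenset(c) for c in comb_split(lst, 4, excludes)}
print(a == b)  # True
```

The constraint check runs only on the constrained part c1, so every c2 paired with a valid c1 is accepted without further testing.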

However, the results depend heavily on the input. For a larger list, with constraints applying to only a few of its elements, this approach is in fact faster than the naive one, as in the timings above for lst = list(range(20)).

Answer 5: (score: 0)

I guess my answer is similar to some of the others, but this is what I fiddled together in parallel.

from itertools import combinations, product

"""
with help from
https://stackoverflow.com/questions/374626/how-can-i-find-all-the-subsets-of-a-set-with-exactly-n-elements
https://stackoverflow.com/questions/32438350/python-merging-two-lists-with-all-possible-permutations
https://stackoverflow.com/questions/952914/making-a-flat-list-out-of-list-of-lists-in-python
"""
def sub_without( S, m, forbidden ):
    out = []
    allowed = [ s for s in S if s not in forbidden ]
    N = len( allowed )
    for k in range( len( forbidden ) ):
        addon = [ list( x ) for x in combinations( forbidden, k) ]
        if N + k >= m:
            base = [ list( x ) for x in combinations( allowed, m - k ) ]
            leveltotal = [ [ item for sublist in x for item in sublist ] for x in product( base, addon ) ]
            out += leveltotal
    return out

val = sub_without( range(6), 4, [ 1, 3, 5 ] )

for x in val:
    print(sorted(x))

>>
[0, 1, 2, 4]
[0, 2, 3, 4]
[0, 2, 4, 5]
[0, 1, 2, 3]
[0, 1, 2, 5]
[0, 2, 3, 5]
[0, 1, 3, 4]
[0, 1, 4, 5]
[0, 3, 4, 5]
[1, 2, 3, 4]
[1, 2, 4, 5]
[2, 3, 4, 5]

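Answer 6: (score: 0)

Algorithmically, you have to compute the combinations of the items in your list that are not among the excluded ones, then add the respective combinations of the excluded items to the combinations of the remaining items. This approach of course requires a lot of checking and keeping track of the indices, and even if you do it in Python it will not give you a notable performance difference (a known disadvantage of a Constraint satisfaction problem) compared with simply computing the combinations with combinations and filtering out the unwanted items.

Therefore, I think this is the best way to go in most cases: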

In [77]: from itertools import combinations, filterfalse

In [78]: list(filterfalse({1, 3}.issubset, combinations(l, 4)))
Out[78]: [(1, 2, 4, 5), (2, 3, 4, 5)]
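The same filter extends to several exclusion lists, which is what the question asks for. This is a minimal sketch, assuming each exclusion is given as a set; the example exclusions and the name `violates` are chosen here for illustration.

```python
from itertools import combinations, filterfalse

l = [1, 2, 3, 4, 5, 6]
excludes = [{1, 2, 3}, {4, 5}]

def violates(comb):
    # True if the combination contains every item of some exclusion set
    return any(e.issubset(comb) for e in excludes)

result = list(filterfalse(violates, combinations(l, 4)))
print(result)
```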