Generating a conditional product (combinations) of lists in Python

Time: 2013-03-01 02:54:10

Tags: python tree combinatorics itertools

I want to be able to generate a conditional product, similar to this answer: All combinations of a list of lists

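I wanted to use itertools.product(*listOfLists). My problem, however, is that including a particular element from one list means that additional lists must be consulted for the product.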

Example:

colors = ['red', 'blue', 'green']
fruits = ['apple', 'orange', 'banana']
locations = ['indoors', 'outdoors']

indoor_choices = ['bathroom', 'bedroom', 'kitchen']
green_choices = ['forest', 'light', 'dark']

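Here, we always want to consider every possible choice of color, fruit, and location. However, when the location is 'indoors' we also want to consider indoor_choices, and when the color is 'green' we also want to pick a more specific shade from green_choices. It is a tree of possibilities in which some branches keep branching and others do not.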

So in the silly example above, you could write a for loop like this:

for c in colors:
    for f in fruits:
        for l in locations:
            pass  # etc

But then we run into the question of what happens when two different categories each branch further depending on the value chosen.

A simple (hacky) solution would be to code the conditions by hand and put the loops inside them:

for c in colors:
    for f in fruits:
        for l in locations:

            if c == 'green' and l == 'indoors':
                for gc in green_choices:
                    for ic in indoor_choices:
                        print(c, f, l, gc, ic)  # output

            elif c == 'green':
                for gc in green_choices:
                    print(c, f, l, gc)  # output

            elif l == 'indoors':
                for ic in indoor_choices:
                    print(c, f, l, ic)  # output

            else:
                print(c, f, l)  # output

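But imagine the horror when there are N lists and M of them have additional branching. Or even worse, the additional branches are themselves nested... basically, this hack does not scale.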

Any ideas? This problem has turned out to be deceptively hard!

4 Answers:

Answer 0 (score: 6)

Here's how I'd do it, using a recursive generator.

def prod(terms, expansions):
    if not terms: # base case
        yield ()
        return

    t = terms[0] # take the first term

    for v in expansions[t]: # expand the term, to get values
        if v not in expansions: # can the value itself be expanded?
            gen = prod(terms[1:], expansions) # if not, we do a basic recursion
        else:
            gen = prod(terms[1:] + [v], expansions) # if so, we add it to terms

        for p in gen: # now we iterate over the results of the recursive call
            yield (v,) + p # and add our value to the start

And here's how you would call it to generate the desired product for your example:

expansions = {
        'colors':['red', 'blue', 'green'],
        'fruits':['apple', 'orange', 'banana'],
        'locations':['indoors', 'outdoors'],
        'indoors':['bathroom', 'bedroom', 'kitchen'],
        'green':['forest', 'light', 'dark']
    }

terms = ["colors", "locations"] # fruits omitted, to reduce the number of lines

for p in prod(terms, expansions):
    print(p)

Output:

('red', 'indoors', 'bathroom')
('red', 'indoors', 'bedroom')
('red', 'indoors', 'kitchen')
('red', 'outdoors')
('blue', 'indoors', 'bathroom')
('blue', 'indoors', 'bedroom')
('blue', 'indoors', 'kitchen')
('blue', 'outdoors')
('green', 'indoors', 'forest', 'bathroom')
('green', 'indoors', 'forest', 'bedroom')
('green', 'indoors', 'forest', 'kitchen')
('green', 'indoors', 'light', 'bathroom')
('green', 'indoors', 'light', 'bedroom')
('green', 'indoors', 'light', 'kitchen')
('green', 'indoors', 'dark', 'bathroom')
('green', 'indoors', 'dark', 'bedroom')
('green', 'indoors', 'dark', 'kitchen')
('green', 'outdoors', 'forest')
('green', 'outdoors', 'light')
('green', 'outdoors', 'dark')
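
As a quick sanity check, here is a minimal sketch that runs the same generator with fruits included and counts the results. The function body is copied from above. The name `results` and the expected total are my own additions: by hand I get 6 plain tuples, 9 green-only, 18 indoors-only and 27 with both, so 60 in total.

```python
def prod(terms, expansions):
    if not terms:  # base case: nothing left to expand
        yield ()
        return

    t = terms[0]
    for v in expansions[t]:
        if v not in expansions:
            gen = prod(terms[1:], expansions)
        else:
            # v opens up further choices, so queue it as an extra term
            gen = prod(terms[1:] + [v], expansions)
        for p in gen:
            yield (v,) + p

expansions = {
    'colors': ['red', 'blue', 'green'],
    'fruits': ['apple', 'orange', 'banana'],
    'locations': ['indoors', 'outdoors'],
    'indoors': ['bathroom', 'bedroom', 'kitchen'],
    'green': ['forest', 'light', 'dark'],
}

# `results` is a name introduced here for illustration only
results = list(prod(['colors', 'fruits', 'locations'], expansions))
print(len(results))  # 60
print(results[0])    # ('red', 'apple', 'indoors', 'bathroom')
```

Note that an expandable value is appended to the end of terms, so its sub-choices appear after all the remaining base terms rather than directly next to the value that triggered them.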

Answer 1 (score: 1)

If your real problem is really like your example, then you can break the combinations down into just four products:

import itertools

is_green = ['green']
not_green = ['red', 'blue']
is_indoors = ['indoors']
not_indoors = ['outdoors']

# product() takes each list as a separate argument, not one list of lists
p1 = itertools.product(not_green, fruits, not_indoors)
...
p2 = itertools.product(is_green, fruits, not_indoors, green_choices)
...
p3 = itertools.product(not_green, fruits, is_indoors, indoor_choices)
...
p4 = itertools.product(is_green, fruits, is_indoors, green_choices, indoor_choices)

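That's all!

Now, if we want to generalize so that we don't have to write four "special" cases, we can capture the relation between certain values and the additional choices they open up, as @DavidRobinson suggested.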
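
If it helps, the four pieces can be joined into one stream with itertools.chain. This is only a sketch of the idea: `all_combos` is a name I made up, and note that itertools.product takes each list as a separate argument.

```python
import itertools

fruits = ['apple', 'orange', 'banana']
indoor_choices = ['bathroom', 'bedroom', 'kitchen']
green_choices = ['forest', 'light', 'dark']

is_green = ['green']
not_green = ['red', 'blue']
is_indoors = ['indoors']
not_indoors = ['outdoors']

p1 = itertools.product(not_green, fruits, not_indoors)
p2 = itertools.product(is_green, fruits, not_indoors, green_choices)
p3 = itertools.product(not_green, fruits, is_indoors, indoor_choices)
p4 = itertools.product(is_green, fruits, is_indoors, green_choices, indoor_choices)

# `all_combos` is a hypothetical name; chain() concatenates the four iterators
all_combos = list(itertools.chain(p1, p2, p3, p4))
print(len(all_combos))  # 6 + 9 + 18 + 27 = 60
```

The tuples come out grouped by case (p1 first, p4 last) rather than in the order the nested loops would produce them.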

import itertools

colors = ['red', 'blue', 'green']
fruits = ['apple', 'orange', 'banana']
locations = ['indoors', 'outdoors']

indoor_choices = ('bathroom', 'bedroom', 'kitchen')
green_choices = ('forest', 'light', 'dark')

choices = [colors, fruits, locations]
more_choices = { 'indoors': indoor_choices, 'green': green_choices }
for p in itertools.product(*choices):
    m = [more_choices[k] for k in p if k in more_choices]
    for r in itertools.product([p], *m):
        print(list(r[0]) + list(r[1:]))

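Note that the number of combinations grows multiplicatively, so you will inevitably run into trouble when choices and more_choices are large.

Answer 2 (score: 1)

Here is a recursive implementation using yield. I don't think it's quite as neat as @Blckknght's solution, but it may be helpful.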
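
One way to soften this, at least on the consumer side, is to wrap the loop in a generator so that tuples are produced one at a time and the caller decides how many to take. The following is a sketch under my own assumptions: `conditional_product` and `first_five` are made-up names, and like the loop above it only handles one level of extra choices (no nesting).

```python
import itertools

def conditional_product(choices, more_choices):
    """Yield each base combination extended by the extra choices it unlocks."""
    for p in itertools.product(*choices):
        extras = [more_choices[k] for k in p if k in more_choices]
        # product() with no arguments yields a single empty tuple,
        # so combinations without extras pass through unchanged
        for r in itertools.product(*extras):
            yield p + r

colors = ['red', 'blue', 'green']
fruits = ['apple', 'orange', 'banana']
locations = ['indoors', 'outdoors']
choices = [colors, fruits, locations]
more_choices = {
    'indoors': ('bathroom', 'bedroom', 'kitchen'),
    'green': ('forest', 'light', 'dark'),
}

# Take only the first five results instead of enumerating everything
first_five = list(itertools.islice(conditional_product(choices, more_choices), 5))
for combo in first_five:
    print(combo)
```

This does not reduce the total number of combinations; it only avoids building them all at once.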

colors = ["red","blue","green"]
fruits = ["apple","orange", "banana"]
locations = ["indoors","outdoors"]

green_subtypes = ["forest", "light", "dark"]
indoor_locations = ["bathroom","bedroom","kitchen"]

def gen(state):
  if len(state)==0:
    for c in colors:
      s = [c]
      for y in gen(s):
        yield y
  elif len(state)==1:
    for x in fruits:
      s = state + [x]
      for y in gen(s):
        yield y
  elif len(state)==2:
    for x in locations:
      s = state + [x]
      for y in gen(s):
        yield y
  else:
    # If we're green and we haven't looped through the green options already 
    # (the check is a bit dodgy and could do with being moved into a flag inside state)
    if state[0]=='green' and len(set(state).intersection(set(green_subtypes)))==0:
      for x in green_subtypes:
        s = state + [x]
        for y in gen(s):
          yield y
    # If we're indoors and we haven't looped through the indoor options already 
    # (the check is a bit dodgy and could do with being moved into a flag inside state)
    elif state[2]=='indoors' and len(set(state).intersection(set(indoor_locations)))==0:
      for x in indoor_locations:
        s = state + [x]
        for y in gen(s):
          yield y
    else:
      yield state

for x in gen([]):
  print(x)
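
The comments above mention moving the "already expanded" check into a flag. Here is one possible sketch of that idea, not a drop-in replacement: the names `base_lists`, `sub_choices` and the `expanded` parameter are my own, and it needs Python 3.3+ for `yield from`.

```python
colors = ["red", "blue", "green"]
fruits = ["apple", "orange", "banana"]
locations = ["indoors", "outdoors"]

# Hypothetical names: the base lists in order, and the values that branch further
base_lists = [colors, fruits, locations]
sub_choices = {
    "green": ["forest", "light", "dark"],
    "indoors": ["bathroom", "bedroom", "kitchen"],
}

def gen(state, expanded=frozenset()):
    # Phase 1: pick one value from each base list in turn
    if len(state) < len(base_lists):
        for x in base_lists[len(state)]:
            yield from gen(state + [x], expanded)
        return
    # Phase 2: expand the first value that branches and has not been expanded yet
    for key in state:
        if key in sub_choices and key not in expanded:
            for x in sub_choices[key]:
                yield from gen(state + [x], expanded | {key})
            return
    # Nothing left to expand
    yield state

results = list(gen([]))
print(len(results))  # 60
print(results[0])    # ['red', 'apple', 'indoors', 'bathroom']
```

Tracking expanded keys explicitly avoids the set-intersection test, and adding a new branching value only requires a new entry in `sub_choices`.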

Answer 3 (score: 1)

We can add the "extra" choices post hoc (Python 3 syntax):

import itertools

def choice_product(choices, *iterables):
    for v in itertools.product(*iterables):
        ks = set(v) & choices.keys()
        if ks:
            choice_iters = [choices[k] for k in ks]
            for p in choice_product(choices, *choice_iters):
                yield v + p
        else:
            yield v

This uses itertools.product for efficiency.

Define choices as

choices = {'indoors' : ['bathroom', 'bedroom', 'kitchen'],
           'green': ['forest', 'light', 'dark']}

This recurses:

>>> for i in choice_product({'c': 'de', 'e': 'fg'}, 'ab', 'cd'):
...     print(i)
... 
('a', 'c', 'd')
('a', 'c', 'e', 'f')
('a', 'c', 'e', 'g')
('a', 'd')
('b', 'c', 'd')
('b', 'c', 'e', 'f')
('b', 'c', 'e', 'g')
('b', 'd')
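
One caveat when a tuple contains more than one expandable value (for example 'green' and 'indoors' together): a set does not guarantee iteration order, so the order of the extra elements may vary between runs. Below is a sketch of a variant that keeps the keys in the order they appear in the tuple. Using dict.fromkeys for order-preserving de-duplication is my own substitution (it relies on ordered dicts, Python 3.7+), and `results` is a made-up name.

```python
import itertools

def choice_product(choices, *iterables):
    for v in itertools.product(*iterables):
        # Keep expandable values in tuple order, dropping duplicates
        ks = list(dict.fromkeys(x for x in v if x in choices))
        if ks:
            choice_iters = [choices[k] for k in ks]
            for p in choice_product(choices, *choice_iters):
                yield v + p
        else:
            yield v

choices = {'indoors': ['bathroom', 'bedroom', 'kitchen'],
           'green': ['forest', 'light', 'dark']}
colors = ['red', 'blue', 'green']
fruits = ['apple', 'orange', 'banana']
locations = ['indoors', 'outdoors']

results = list(choice_product(choices, colors, fruits, locations))
print(len(results))  # 60
print(results[0])    # ('red', 'apple', 'indoors', 'bathroom')
```

With this variant the green shade always comes before the room, because 'green' appears before 'indoors' in the base tuple.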