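My question is essentially the same as this question. I have a list of characters. I want every way of splitting that list into consecutive groups, with a limit on the number of characters per group (for example, at most 2 characters). In addition, no single character may be repeated within a row: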
chars = ['a', 'b', 'c', 'd']
# output
output = [['a', 'b', 'c', 'd'],
          ['ab', 'c', 'd'],
          ['a', 'bc', 'd'],
          ['a', 'b', 'cd'],
          ['ab', 'cd'],
          ['abc', 'd'],   # excluded: 'abc' exceeds the limit
          ['a', 'bcd'],   # excluded: 'bcd' exceeds the limit
          ['abcd']]       # excluded: 'abcd' exceeds the limit
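I know I can check the condition while generating and building the sequences, so that over-limit combinations are skipped, but that adds to the run time. My goal is to reduce the existing execution time.
Without a character limit, 2^(N-1) combinations are generated. If the list has more than 15 characters, the program takes a very long time to run. I therefore want to use the character limit to reduce the number of combinations that are generated at all.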
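As a rough sanity check on those numbers, the count can be computed without generating any combinations. This is a minimal sketch, assuming a hypothetical helper named `count_splits` that is not part of the original code: the first group takes between 1 and `maxlen` characters, and the remainder is counted recursively.

```python
from functools import lru_cache

def count_splits(n, maxlen=None):
    """Count the ways to cut n characters into consecutive groups of at most maxlen characters."""
    if maxlen is None:
        maxlen = n  # no limit on the group size

    @lru_cache(maxsize=None)
    def ways(remaining):
        if remaining == 0:
            return 1  # exactly one way to split nothing
        # the first group takes 1..maxlen characters; count the splits of the rest
        return sum(ways(remaining - first)
                   for first in range(1, min(maxlen, remaining) + 1))

    return ways(n)

print(count_splits(4))      # 8, i.e. 2 ** (4 - 1)
print(count_splits(4, 2))   # 5, the rows kept in the example
print(count_splits(15))     # 16384
print(count_splits(15, 2))  # 987
```

With a limit of 2 the count grows like the Fibonacci numbers rather than doubling with every extra character, so there is a lot to gain from never building the over-limit rows.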
Performance is the priority. I have been researching and experimenting for two days without any success.
Answer 0 (score: 2)
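One approach is to iterate over the input list and build the combinations step by step. At each step, the next character is taken from the input list and added to the previously generated combinations, either as a new word or appended to the last word if that word has not yet reached the maximum length.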
from collections import defaultdict

def make_combinations(seq, maxlen):
    # memo is a dict of {length_of_last_word: list_of_combinations}
    memo = defaultdict(list)
    memo[1] = [[seq[0]]]  # put the first character into the memo

    seq_iter = iter(seq)
    next(seq_iter)  # skip the first character
    for char in seq_iter:
        new_memo = defaultdict(list)

        # iterate over the memo and expand it
        for wordlen, combos in memo.items():
            # add the current character as a separate word
            new_memo[1].extend(combo + [char] for combo in combos)

            # if the maximum word length isn't reached yet, add a character to the last word
            if wordlen < maxlen:
                word = combos[0][-1] + char

                new_memo[wordlen+1] = newcombos = []
                for combo in combos:
                    combo[-1] = word  # overwrite the last word with a longer one
                    newcombos.append(combo)

        memo = new_memo

    # flatten the memo into a list and return it
    return [combo for combos in memo.values() for combo in combos]
Output:
[['a', 'b', 'c', 'd'], ['ab', 'c', 'd'], ['a', 'bc', 'd'],
['a', 'b', 'cd'], ['ab', 'cd']]
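The idea of never building an over-limit word can also be written as a recursive generator. This is a different technique from the memo-based code above, sketched here only for comparison; `iter_splits` is a hypothetical name, and the sketch assumes the input items are strings.

```python
def iter_splits(seq, maxlen):
    """Yield every split of seq into consecutive words of at most maxlen characters."""
    if not seq:
        yield []  # nothing left to split
        return
    # choose the length of the first word, then split the remainder recursively
    for size in range(1, min(maxlen, len(seq)) + 1):
        first = ''.join(seq[:size])
        for rest in iter_splits(seq[size:], maxlen):
            yield [first] + rest

for combo in iter_splits(['a', 'b', 'c', 'd'], 2):
    print(combo)
```

It yields the same five combinations, though not in the same order as `make_combinations`, and it produces them lazily instead of holding the whole list in memory. Whether it is faster has not been measured here.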
For short inputs, this implementation is slower than a naive itertools.product approach:
input: a b c d
maxlen: 2
iterations: 10000
itertools.product: 0.11653625800136069 seconds
make_combinations: 0.16573870600041118 seconds
But it quickly pulls ahead as the input list gets longer:
input: a b c d e f g h i j k
maxlen: 2
iterations: 10000
itertools.product: 6.9087735799985240 seconds
make_combinations: 1.2037671390007745 seconds
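The timings above refer to an itertools.product approach whose code is not shown in this thread. The following is a minimal sketch of what such a baseline and a timing harness might look like. `product_splits` and `time_it` are hypothetical names, and this is not necessarily the code that produced the numbers above. It assumes a non-empty list of strings.

```python
import timeit
from itertools import product

def product_splits(seq, maxlen):
    """Baseline: try every cut/no-cut pattern, then discard splits with an over-long word."""
    result = []
    # one flag per gap between neighbouring characters: True means "cut here"
    for cuts in product((True, False), repeat=len(seq) - 1):
        combo = [seq[0]]
        for char, cut in zip(seq[1:], cuts):
            if cut:
                combo.append(char)   # start a new word
            else:
                combo[-1] += char    # extend the current word
        if all(len(word) <= maxlen for word in combo):
            result.append(combo)
    return result

def time_it(func, seq, maxlen, iterations=100):
    """Return the total seconds taken by `iterations` calls of func(seq, maxlen)."""
    return timeit.timeit(lambda: func(seq, maxlen), number=iterations)

seq = list('abcdefghijk')
print('product_splits:', time_it(product_splits, seq, 2), 'seconds')
```

Passing `make_combinations` to `time_it` in the same way gives a like-for-like comparison. The baseline always walks all 2^(N-1) patterns, which is consistent with it falling behind as the input grows.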
Answer 1 (score: 1)
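Usually, it is easier to generate the full list of combinations/permutations and then filter the result to get the desired output. You can use a recursive generator function to get the combinations, then remove the duplicates and filter the results: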
chars = ['a', 'b', 'c', 'd']

def get_combos(c):
    if len(c) == 1:
        yield c
    else:
        yield c
        for i in range(len(c)-1):
            # merge the neighbours at positions i and i+1, then recurse on the shorter list
            yield from get_combos([c[d]+c[d+1] if d == i else c[d] if d < i else c[d+1] for d in range(len(c)-1)])

final_listing = list(get_combos(chars))
# drop duplicates, then keep only the combinations whose words all have fewer than 3 characters
last_results = list(filter(lambda x:all(len(c) < 3 for c in x), [a for i, a in enumerate(final_listing) if a not in final_listing[:i]]))
Output:
[['a', 'b', 'c', 'd'], ['ab', 'c', 'd'], ['ab', 'cd'], ['a', 'bc', 'd'], ['a', 'b', 'cd']]
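The duplicate check `a not in final_listing[:i]` rescans the list for every element. A possible variation, sketched under the assumption that the words are hashable strings, tracks already-seen combinations in a set instead. `unique_within_limit` and its `max_len` parameter are hypothetical names that are not in the answer; `max_len=2` corresponds to the `len(c) < 3` test.

```python
def get_combos(c):
    """Recursive generator from the answer: yield c, then every merge of two neighbours."""
    yield c
    for i in range(len(c) - 1):
        # merge the neighbours at positions i and i+1, then recurse
        merged = c[:i] + [c[i] + c[i + 1]] + c[i + 2:]
        yield from get_combos(merged)

def unique_within_limit(combos, max_len):
    """Yield each distinct combination once, skipping those with a word longer than max_len."""
    seen = set()
    for combo in combos:
        key = tuple(combo)  # lists are not hashable, tuples are
        if key in seen:
            continue
        seen.add(key)
        if all(len(word) <= max_len for word in combo):
            yield combo

print(list(unique_within_limit(get_combos(['a', 'b', 'c', 'd']), 2)))
```

This only speeds up the deduplication step. The generator still visits every merge sequence, duplicates included, so the approach remains much more expensive than one that avoids building over-limit words in the first place.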