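I am using Python 3. The function I am working with is as follows:

    def sub_combinations(segment):
        if len(segment) == 1:
            yield (segment,)
        else:
            for j in sub_combinations(segment[1:]):
                # segment[0] forms a group of its own ...
                yield ((segment[0],),) + j
                # ... or joins each existing group of j in turn
                for k in range(len(j)):
                    yield ((segment[0],) + j[k],) + j[:k] + j[k+1:]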

The output of this version of the function for (1,2,3,4,5) is as follows:
((1,), (2,), (3,), (4,), (5,))
((1, 2), (3,), (4,), (5,))
((1, 3), (2,), (4,), (5,))
((1, 4), (2,), (3,), (5,)) *
((1, 5), (2,), (3,), (4,)) *
((1,), (2, 3), (4,), (5,))
((1, 2, 3), (4,), (5,))
((1, 4), (2, 3), (5,)) *
((1, 5), (2, 3), (4,)) *
((1,), (2, 4), (3,), (5,))
((1, 2, 4), (3,), (5,))
((1, 3), (2, 4), (5,))
((1, 5), (2, 4), (3,)) *
((1,), (2, 5), (3,), (4,)) *
((1, 2, 5), (3,), (4,)) *
((1, 3), (2, 5), (4,)) *
((1, 4), (2, 5), (3,)) *
((1,), (2,), (3, 4), (5,))
((1, 2), (3, 4), (5,))
((1, 3, 4), (2,), (5,))
((1, 5), (2,), (3, 4)) *
((1,), (2, 3, 4), (5,))
((1, 2, 3, 4), (5,))
((1, 5), (2, 3, 4)) *
((1,), (2, 5), (3, 4)) *
((1, 2, 5), (3, 4)) *
((1, 3, 4), (2, 5)) *
((1,), (2,), (3, 5), (4,))
((1, 2), (3, 5), (4,))
((1, 3, 5), (2,), (4,))
((1, 4), (2,), (3, 5)) *
((1,), (2, 3, 5), (4,))
((1, 2, 3, 5), (4,))
((1, 4), (2, 3, 5)) *
((1,), (2, 4), (3, 5))
((1, 2, 4), (3, 5))
((1, 3, 5), (2, 4))
((1,), (2,), (3,), (4, 5))
((1, 2), (3,), (4, 5))
((1, 3), (2,), (4, 5))
((1, 4, 5), (2,), (3,)) *
((1,), (2, 3), (4, 5))
((1, 2, 3), (4, 5))
((1, 4, 5), (2, 3)) *
((1,), (2, 4, 5), (3,))
((1, 2, 4, 5), (3,))
((1, 3), (2, 4, 5))
((1,), (2,), (3, 4, 5))
((1, 2), (3, 4, 5))
((1, 3, 4, 5), (2,))
((1,), (2, 3, 4, 5))
((1, 2, 3, 4, 5),)
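
The listing above has 52 rows, which matches the Bell number B(5), the number of ways to split five items into groups. As a rough sketch of how quickly that count grows with the length of the tuple (the helper name `bell_numbers` is made up for this illustration and is not part of the original function):

```python
def bell_numbers(limit):
    """Return [B(0), ..., B(limit)] using the Bell triangle."""
    row = [1]          # current row of the triangle
    numbers = [1]      # B(0) = 1
    for _ in range(limit):
        # Each new row starts with the last entry of the previous row.
        new_row = [row[-1]]
        for value in row:
            # Every further entry is its left neighbour plus the entry above.
            new_row.append(new_row[-1] + value)
        row = new_row
        numbers.append(row[0])
    return numbers


counts = bell_numbers(10)
print(counts[5])    # 52, the number of rows listed above
print(counts[10])   # 115975 rows for a 10-element tuple
```

Since every row is generated one by one, the run time of an unrestricted sub_combinations grows at least as fast as these counts.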
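
The problem is that for larger tuples sub_combinations returns a huge amount of data and takes too long to compute. To address this, I want to limit the amount of data returned by adding an extra parameter. For example, sub_combinations((1,2,3,4,5), 2) should return the data above, but without the rows marked with an asterisk. Those rows are dropped because the offset between consecutive values inside one of their tuples is greater than 2. For example, rows containing (1,4), (1,5) or (2,5) are discarded, and so are rows containing tuples such as (1,2,5).

The line

    for k in range(len(j)):

needs to be adjusted to drop these rows, but I have not figured out how. Any suggestions?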
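
To pin the criterion down, it can be written as a standalone check. This is only a sketch: the name `within_offset` is made up here, the values are assumed to be distinct, and "offset" is taken to mean the distance between positions in the original tuple (for (1,2,3,4,5) that equals the difference between the values).

```python
def within_offset(partition, segment, max_offset):
    """Return True if neighbouring values in every group sit at most
    max_offset positions apart in segment."""
    # Assumption: values are distinct, so each one has a single position.
    position = {value: i for i, value in enumerate(segment)}
    return all(
        position[b] - position[a] <= max_offset
        for group in partition
        for a, b in zip(group, group[1:])  # neighbouring pairs inside one group
    )


segment = (1, 2, 3, 4, 5)
print(within_offset(((1, 3), (2, 4), (5,)), segment, 2))        # True
print(within_offset(((1, 4), (2,), (3,), (5,)), segment, 2))    # False: 1 -> 4
print(within_offset(((1, 2, 5), (3, 4)), segment, 2))           # False: 2 -> 5
```

Filtering finished rows with such a check would give the right output but would not save any time, because every row is still generated first; the restriction has to be applied while the rows are being built.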
Barry

Answer 0 (score: 1)
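
I think the following changes produce the output you are looking for:

    def sub_combinations(segment, max_offset=None):
        # Wrap each element in a one-item list; positions are looked up in data.
        data = tuple([e] for e in segment)

        def _sub_combinations(segment):
            if len(segment) == 1:
                yield (segment,)
            else:
                for j in _sub_combinations(segment[1:]):
                    yield ((segment[0],),) + j
                    for k in range(len(j)):
                        # Stop as soon as the first element of group k is more
                        # than max_offset positions away from segment[0].
                        # "is not None" keeps max_offset=0 a real limit.
                        if (max_offset is not None and
                                data.index(j[k][0]) - data.index(segment[0]) > max_offset):
                            break
                        yield ((segment[0],) + j[k],) + j[:k] + j[k+1:]

        # Unwrap the one-item lists again before handing the result back.
        for combination in _sub_combinations(data):
            yield tuple(tuple(e[0] for e in t) for t in combination)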

The idea here is that you break out of the k loop instead of yielding a tuple whose offset is greater than max_offset.
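
One possible variation, offered as a sketch and not as part of the answer above: build the groups from positions instead of values, so that no .index() lookup is needed and repeated values in the input are told apart. The name `sub_combinations_by_index` is made up for this illustration. The early break relies on the groups always being ordered by their first position, which holds because the group containing the current element is always placed first.

```python
def sub_combinations_by_index(segment, max_offset=None):
    """Yield the groupings of segment in which neighbouring members of a
    group are at most max_offset positions apart (no limit if None)."""
    def partitions(start):
        # Groups hold positions into segment, not the values themselves.
        if start == len(segment) - 1:
            yield ((start,),)
            return
        for rest in partitions(start + 1):
            yield ((start,),) + rest
            for k, group in enumerate(rest):
                if max_offset is not None and group[0] - start > max_offset:
                    # Groups are ordered by first position, so every later
                    # group is even further away from start.
                    break
                yield ((start,) + group,) + rest[:k] + rest[k + 1:]

    if not segment:
        return  # nothing to group
    for combination in partitions(0):
        # Translate positions back into the original values.
        yield tuple(tuple(segment[i] for i in group) for group in combination)


unrestricted = list(sub_combinations_by_index((1, 2, 3, 4, 5)))
limited = list(sub_combinations_by_index((1, 2, 3, 4, 5), 2))
print(len(unrestricted))  # 52
print(len(limited))       # 34, the rows without an asterisk in the question
```

Because the check happens before a group is extended, rejected rows are never built, and everything that would have been derived from them further up the recursion is skipped as well.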