Find all possible sublists of a list

Time: 2013-06-12 09:08:56

Tags: python

Suppose I have the following list:

[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]

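I want to find all possible sublists of a given length that do not contain one particular number, without changing the order of the numbers.

For example, all possible sublists of length 6 that do not contain 12 are: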

[1,2,3,4,5,6]
[2,3,4,5,6,7]
[3,4,5,6,7,8]
[4,5,6,7,8,9]
[5,6,7,8,9,10]
[6,7,8,9,10,11]
[13,14,15,16,17,18]

The catch is that I want to do this on a very large list, so I want the fastest way.

Update with my approach:

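oldlist = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]
newlist = []
length = 6
exclude = 12
# Slide a window of `length` items over every valid start index.
for i in range(len(oldlist) - length + 1):
    newlist.append(oldlist[i:i + length])
# Build a filtered list instead of removing items while iterating.
newlist = [sub for sub in newlist if exclude not in sub]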

I know this is not the best way, which is why I need a better one.

6 answers:

Answer 0 (score: 10)

A simple, unoptimized solution:

result = [sublist for sublist in 
        (lst[x:x+size] for x in range(len(lst) - size + 1))
        if item not in sublist
    ]
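To make the snippet above easier to try, here is a minimal sketch that wraps the same idea in a function and runs it on the list from the question. The function name `windows_without` and the variable `numbers` are my own placeholders, not part of the answer.

```python
def windows_without(lst, size, item):
    """Return every contiguous window of `size` elements that lacks `item`."""
    # All contiguous slices of the requested length, produced lazily.
    candidates = (lst[x:x + size] for x in range(len(lst) - size + 1))
    # Keep only the slices that do not contain the excluded value.
    return [window for window in candidates if item not in window]

numbers = list(range(1, 19))  # the list from the question
result = windows_without(numbers, 6, 12)
print(len(result))  # 7
print(result[-1])   # [13, 14, 15, 16, 17, 18]
```

Each window is scanned for `item`, so the work grows roughly with len(lst) * size.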

An optimized version:

result = []
start = 0
while start < len(lst):
    try:
        end = lst.index(item, start)
    except ValueError:
        end = len(lst)
    result.extend(lst[x+start:x+start+size] for x in range(end - start - size + 1))
    start = end + 1
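As a rough sketch of the same idea in runnable form: split the list at every occurrence of the excluded item, then slide the window only inside each run. The function names are hypothetical, and searching from `start` itself (so that an occurrence at that index is found) is my assumption about the intended behaviour. The loop at the end compares the result with the simple version on a few edge cases.

```python
def windows_without_fast(lst, size, item):
    """Sketch: split at each `item`, then slide the window inside each run."""
    result = []
    start = 0
    while start < len(lst):
        try:
            # Search from `start` itself so an occurrence at that index is seen.
            end = lst.index(item, start)
        except ValueError:
            end = len(lst)
        # Runs shorter than `size` give an empty range and add nothing.
        result.extend(lst[x:x + size] for x in range(start, end - size + 1))
        start = end + 1
    return result

def windows_without_naive(lst, size, item):
    """Reference version: test every window for the excluded item."""
    return [lst[x:x + size] for x in range(len(lst) - size + 1)
            if item not in lst[x:x + size]]

# Edge cases: item first, item repeated back to back, empty list.
cases = [list(range(1, 19)), [12, 1, 2, 3], [1, 12, 12, 2, 3, 4], []]
for case in cases:
    assert windows_without_fast(case, 2, 12) == windows_without_naive(case, 2, 12)
```

The gain comes from never building a window that crosses an excluded item, so no per-window membership test is needed.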

Answer 1 (score: 7)

Using itertools.combinations:

import itertools
mylist = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]
def contains_sublist(lst, sublst):
    n = len(sublst)
    return any((sublst == lst[i:i+n]) for i in range(len(lst)-n+1))
print([i for i in itertools.combinations(mylist,6) if 12 not in i and contains_sublist(mylist, list(i))])

Prints:

[(1, 2, 3, 4, 5, 6), (2, 3, 4, 5, 6, 7), (3, 4, 5, 6, 7, 8), (4, 5, 6, 7, 8, 9), (5, 6, 7, 8, 9, 10), (6, 7, 8, 9, 10, 11), (13, 14, 15, 16, 17, 18)]
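Since the question asks about very large lists, it may help to see how many candidates this approach examines. The sketch below only counts them; it assumes Python 3.8+ for `math.comb`, and the variable names are mine.

```python
import math

n, size = 18, 6
candidates = math.comb(n, size)  # tuples that itertools.combinations yields
windows = n - size + 1           # contiguous slices of that length
print(candidates, windows)       # 18564 13
```

The number of combinations grows much faster than the number of windows, so this is likely to be practical only for short lists.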

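Answer 2 (score: 2)

I like building solutions out of small, composable parts. A few years of writing Haskell will do that to you. So I would do it like this...

First, this returns an iterator over all sublists, in ascending order of length, starting with the empty list: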

from itertools import chain, combinations

def all_sublists(l):
    return chain(*(combinations(l, i) for i in range(len(l) + 1)))

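Single-letter variable names are generally discouraged, but I think they are perfectly reasonable in short bursts of highly abstract code.

(By the way, to omit the empty list, use range(1, len(l) + 1) instead.)

We can then solve your problem by adding your criteria: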

def filtered_sublists(input_list, length, exclude):
    return (
        l for l in all_sublists(input_list)
        if len(l) == length and exclude not in l
    )

For example:

oldlist = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]
length = 6
exclude = 12
newlist = filtered_sublists(oldlist, length, exclude)

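Answer 3 (score: 1)

The simplest way I can think of is to remove the excluded number from the list and then use itertools.combinations() to generate the desired sublists. This has the added advantage that it produces the sublists iteratively.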

from itertools import combinations

def combos_with_exclusion(lst, exclude, length):
    for combo in combinations((e for e in lst if e != exclude), length):
        yield list(combo)

mylist = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]

for sublist in combos_with_exclusion(mylist, 12, 6):
    print(sublist)

Output:

[1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 7]
[1, 2, 3, 4, 5, 8]
[1, 2, 3, 4, 5, 9]
[1, 2, 3, 4, 5, 10]
[1, 2, 3, 4, 5, 11]
[1, 2, 3, 4, 5, 13]
        ...
[11, 14, 15, 16, 17, 18]
[13, 14, 15, 16, 17, 18]
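Because the function is a generator, a caller can stop early without producing every combination. A minimal sketch using `itertools.islice` follows; the generator is repeated so the example runs on its own, and `first_three` is just an illustrative name.

```python
from itertools import combinations, islice

def combos_with_exclusion(lst, exclude, length):
    # Same generator as in the answer, repeated so the sketch is self-contained.
    for combo in combinations((e for e in lst if e != exclude), length):
        yield list(combo)

mylist = list(range(1, 19))
# Take only the first three results; the rest are never generated.
first_three = list(islice(combos_with_exclusion(mylist, 12, 6), 3))
print(first_three)
# [[1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 7], [1, 2, 3, 4, 5, 8]]
```

Note that these results are combinations, so most of them are not contiguous runs of the original list.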

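Answer 4 (score: 0)

I tried to build all possible lists of lists recursively. The depth parameter is simply the number of items to remove from each list. This is not a sliding window.

Code: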

def sublists(input, depth):
    output= []
    if depth > 0:
        for i in range(0, len(input)):
            sub= input[0:i] + input[i+1:]
            output += [sub]
            output.extend(sublists(sub, depth-1))
    return output

Examples (typed interactively in python3):

sublists([1,2,3,4],1)

[[2,3,4],[1,3,4],[1,2,4],[1,2,3]]

sublists([1,2,3,4],2)

[[2,3,4],[3,4],[2,4],[2,3],[1,3,4],[3,4],[1,4],[1,3],[1,2,4],[2,4],[1,4],[1,2],[1,2,3],[2,3],[1,3],[1,2]]

sublists([1,2,3,4],3)

[[2,3,4],[3,4],[4],[3],[2,4],[4],[2],[2,3],[3], [2],[1,3,4],[3,4],[4],[3],[1,4],[4],[1],[1,3],[3], [1],[1,2,4],[2,4],[4],[2],[1,4],[4],[1],[1,2],[2], [1],[1,2,3],[2,3],[3],[2],[1,3],[3],[1],[1,2],[2], [1]]

Some edge cases:

sublists([1,2,3,4],100)

[[2,3,4],[3,4],[4],[],[3],[],[2,4],[4],[],[2],[],[2,3],[3],[],[2],[],[1,3,4],[3,4],[4],[],[3],[],[1,4],[4],[],[1],[],[1,3],[3],[],[1],[],[1,2,4],[2,4],[4],[],[2],[],[1,4],[4],[],[1],[],[1,2],[2],[],[1],[],[1,2,3],[2,3],[3],[],[2],[],[1,3],[3],[],[1],[],[1,2],[2],[],[1],[]]

sublists([], 1)

[]

Note: the resulting list of lists contains duplicates.
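If the duplicates are unwanted, one possible alternative is to generate each distinct result once with itertools.combinations. This is only a sketch: `unique_sublists` is a hypothetical name, it assumes the items are distinct, and the results come out in a different order than the recursive version.

```python
from itertools import combinations

def unique_sublists(items, depth):
    """Sketch: each distinct result of removing 1..depth items, listed once."""
    n = len(items)
    out = []
    # Removing `removed` items leaves a combination of n - removed items.
    for removed in range(1, min(depth, n) + 1):
        out.extend(list(c) for c in combinations(items, n - removed))
    return out

print(unique_sublists([1, 2, 3, 4], 2))
# [[1, 2, 3], [1, 2, 4], [1, 3, 4], [2, 3, 4],
#  [1, 2], [1, 3], [1, 4], [2, 3], [2, 4], [3, 4]]
```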

Answer 5 (score: 0)

I have an answer, but I don't think it is the best one:

oldlist = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]
result = []
def sub_list(lst):
    if len(lst) <= 1:
        result.append(tuple(lst))
        return
    else:
        result.append(tuple(lst))
    for i in lst:
        new_lst = lst[:]
        new_lst.remove(i)
        sub_list(new_lst)
sub_list(oldlist)
newlist = set(result)    # the recursion produces a great many duplicate
                         # sublists, so a set is used to remove them; that is
                         # also why tuples (hashable) are appended above
print(newlist)

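It gets the result, but because it produces so many identical sublists, it needs a lot of memory and a lot of time. I don't think this is a good approach.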
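To put a rough number on that cost, the sketch below counts how many times sub_list would be called for a list of n distinct items, following the recursion above: one call for the list itself plus one recursive call per removed element. The helper name `call_count` is mine.

```python
def call_count(n):
    """Number of sub_list calls made for a list of n distinct items."""
    if n <= 1:
        return 1  # base case: the function appends and returns
    # One call for this list, then n recursive calls on lists of n - 1 items.
    return 1 + n * call_count(n - 1)

print(call_count(4), 2 ** 4)  # 41 16
```

The count grows factorially, while there are at most 2 ** n distinct sublists, which is why nearly all of the work goes into duplicates.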