How do I count sublists based on common elements in a nested list in Python?

Asked: 2018-02-23 11:31:50

Tags: python list nested sublist

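I have a list of lists like this: [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3], [3, 4]]. How can I find the lists that are sublists of more than two other lists? For example, here [2, 3] and [3, 4] are each contained in all of the first 3 lists. I want to get rid of them.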

3 Answers:

Answer 0 (score: 2)

This list comprehension should do it:

data = [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3], [3, 4]]
# Keep a list only if fewer than 3 lists (itself included) contain all of its elements.
solution = [i for i in data if sum(1 for j in data if set(i).issubset(j)) < 3]
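The same idea can be sketched as a reusable function. The name `drop_common_sublists` and the `max_containers` parameter are hypothetical, not part of the answer. This version builds each set once and, unlike the one-liner, does not count a list as containing itself, so the threshold refers to other lists only.

```python
def drop_common_sublists(lists, max_containers=2):
    """Keep the lists whose elements are all contained in at most
    `max_containers` OTHER lists (hypothetical helper, order-insensitive)."""
    sets = [set(lst) for lst in lists]  # convert once instead of per comparison
    kept = []
    for i, current in enumerate(sets):
        # Count the other lists that contain every element of the current one.
        containers = sum(
            1 for j, other in enumerate(sets) if j != i and current <= other
        )
        if containers <= max_containers:
            kept.append(lists[i])
    return kept


data = [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3], [3, 4]]
result = drop_common_sublists(data)
print(result)
# [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7]]
```

Because the comparison uses sets, element order and duplicates inside a list are ignored, and two identical lists count as containing each other.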

Answer 1 (score: 1)

set_list = [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3], [3, 4]]
check_list = [[2, 3], [3, 4]]
sublist_to_list = {}

# For each candidate in check_list, record every list that contains all of its elements.
# The loop variable is named `lst` so that the built-in `set` is not shadowed.
for lst in set_list:
    for i, sublist in enumerate(check_list):
        if all(element in lst for element in sublist):
            sublist_to_list.setdefault(i, []).append(lst)

print(sublist_to_list)

Output: {0: [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3]], 1: [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [3, 4]]}

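  • This means [2, 3] is a subset of each list in [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3]]
  • and [3, 4] is a subset of each list in [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [3, 4]]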
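If the candidates are not known in advance, the mapping can be computed for every list instead of a hand-written check_list. The following is only a sketch under that assumption; the function name `containing_lists` is hypothetical, and it reports indices of the containing lists rather than the lists themselves.

```python
def containing_lists(lists):
    """Map each list's index to the indices of the OTHER lists
    that contain all of its elements (hypothetical helper)."""
    sets = [set(lst) for lst in lists]
    return {
        i: [j for j, other in enumerate(sets) if j != i and current <= other]
        for i, current in enumerate(sets)
    }


set_list = [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3], [3, 4]]
containers = containing_lists(set_list)
print(containers)
# {0: [], 1: [0], 2: [0], 3: [0, 1, 2], 4: [0, 1, 2]}

# Lists contained in more than two other lists are the ones to remove.
to_remove = [set_list[i] for i, found in containers.items() if len(found) > 2]
print(to_remove)
# [[2, 3], [3, 4]]
```

Excluding `j == i` keeps a list from being reported as its own container, which the loop above does report.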

Answer 2 (score: 0)

You can first create a function that yields the contiguous sublists of a list:

def sublists(lst):
    length = len(lst)

    for size in range(1, length + 1):
        for start in range(length - size + 1):
            yield lst[start:start+size]

It works like this:

>>> list(sublists([1, 2, 3, 4, 5]))
[[1], [2], [3], [4], [5], [1, 2], [2, 3], [3, 4], [4, 5], [1, 2, 3], [2, 3, 4], [3, 4, 5], [1, 2, 3, 4], [2, 3, 4, 5], [1, 2, 3, 4, 5]]

Then you can use it to collect, for every sublist, the indices of the lists it occurs in, using a collections.defaultdict:

from collections import defaultdict

lsts = [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3], [3, 4]]

d = defaultdict(list)
for i, lst in enumerate(lsts):
    # Iterate over the generator directly; the for loop handles StopIteration.
    for sub in sublists(lst):
        d[tuple(sub)].append(i)

This gives a dictionary with the sublists (as tuples) as keys and lists of list indices as values.

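Then, to find the sublists that occur in more than two of the lists, you can check whether the set of indices for each key has more than two elements: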

print([list(k) for k, v in d.items() if len(set(v)) > 2])

This gives the following sublists:

[[2], [3], [4], [5], [2, 3], [3, 4], [4, 5], [2, 3, 4], [3, 4, 5], [2, 3, 4, 5]]
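The same index can also be used to filter the original lists, which is what the question asks for. The sketch below assumes the elements are hashable; the names `build_index` and `remove_frequent` and the `threshold` parameter are hypothetical. Each input list is looked up as a key, and it is dropped when it occurs as a contiguous slice of more than `threshold` other lists.

```python
from collections import defaultdict


def sublists(lst):
    """Yield every contiguous, non-empty slice of lst, shortest first."""
    length = len(lst)
    for size in range(1, length + 1):
        for start in range(length - size + 1):
            yield lst[start:start + size]


def build_index(lists):
    """Map each contiguous sublist (as a tuple) to the set of list indices containing it."""
    index = defaultdict(set)
    for i, lst in enumerate(lists):
        for sub in sublists(lst):
            index[tuple(sub)].add(i)
    return index


def remove_frequent(lists, threshold=2):
    """Drop lists that occur as a contiguous slice of more than `threshold` other lists."""
    index = build_index(lists)
    return [
        lst for i, lst in enumerate(lists)
        if len(index[tuple(lst)] - {i}) <= threshold  # ignore the list's own index
    ]


lsts = [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7], [2, 3], [3, 4]]
filtered = remove_frequent(lsts)
print(filtered)
# [[1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6, 7]]
```

Note that this treats a sublist as a contiguous run, so [2, 4] would not count as a sublist of [1, 2, 3, 4, 5], whereas a set-based subset test would accept it. The index also grows quadratically with the length of each list, so this is best suited to short lists.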