Question

I have the following list:

indices_to_remove: [0,1,2,3,..,600,800,801,802,....,1200,1600,1601,1602,...,1800]

It basically contains 3 subsets of consecutive indices:

0-600
800-1200
1600-1800

I want to create 3 separate smaller lists, each containing only consecutive numbers.

Expected result:

indices_to_remove_1 : [0,1,2,3,....,600]
indices_to_remove_2 : [800,801,802,....,1200]
indices_to_remove_3 : [1600,1601,1602,....., 1800]

P.S.: The numbers are arbitrary; also, I may end up with 3 subsets or fewer.

Answer 1

I like to use generators for this kind of problem. You can do it like this:

Split non-consecutive data:

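def split_non_consequtive(data):
    data = iter(data)
    chunk = []
    try:
        # Fetch the first value inside the try block so that an empty
        # input ends the generator cleanly instead of raising an error.
        val = next(data)
        while True:
            chunk.append(val)
            val = next(data)
            if val != chunk[-1] + 1:
                # Gap found: emit the finished run and start a new one.
                yield chunk
                chunk = []
    except StopIteration:
        # Input exhausted: emit the last run, if there is one.
        if chunk:
            yield chunk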

Test code:

indices_to_remove = (
        list(range(0, 11)) +
        list(range(80, 91)) +
        list(range(160, 171))
)

for i in split_non_consequtive(indices_to_remove):
    print(i)

Results:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90]
[160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170]
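Since the question mentions that the number of subsets can vary, it is probably easier to collect the groups into one list than to create separately named variables. Below is a minimal, self-contained sketch of that idea; `split_runs` is a hypothetical helper written only for this example, and the sample data is made up:

```python
def split_runs(values):
    """Yield lists of consecutive integers (hypothetical helper)."""
    run = []
    for value in values:
        # A gap closes the current run and starts a new one.
        if run and value != run[-1] + 1:
            yield run
            run = []
        run.append(value)
    # Emit the last run, unless the input was empty.
    if run:
        yield run


# Assumed sample data: three runs, one of them a single value.
indices_to_remove = list(range(0, 4)) + [7] + list(range(20, 23))

# Materialise the generator once; len(groups) is the number of runs found.
groups = list(split_runs(indices_to_remove))

for number, group in enumerate(groups, start=1):
    print(f"indices_to_remove_{number}: {group}")
```

With this layout, `groups[0]` plays the role of `indices_to_remove_1`, and the code keeps working whether there are three runs, fewer, or more.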

Answer 2

Another approach is to use more_itertools.consecutive_groups (using @Stephen's list as the example):

import more_itertools as mit
for group in mit.consecutive_groups(indices_to_remove):
    print(list(group))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90]
[160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170]
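If installing a third-party package is not an option, a similar grouping can be sketched with the standard library alone. The idea is that, inside a run of consecutive integers, the difference between each value and its position stays constant, so that difference can serve as a `groupby` key. The name `consecutive_runs` and the sample data are assumptions for illustration, and the sketch assumes the input is in increasing order:

```python
from itertools import groupby


def consecutive_runs(values):
    """Yield lists of consecutive integers (hypothetical stdlib-only helper)."""
    # Within a run, value - position is constant, so it works as the group key.
    for _, group in groupby(enumerate(values), key=lambda pair: pair[1] - pair[0]):
        yield [value for _, value in group]


# Assumed sample data: two runs of three values and one isolated value.
sample = list(range(0, 3)) + list(range(80, 83)) + [160]
print(list(consecutive_runs(sample)))
```

Unlike the lazy groups in the example above, this version builds each group as a list right away, which is simpler to work with but holds a whole run in memory at once.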

Answer 3

No need to overcomplicate it; you can solve it like this:

def chunk_lists_(data_):

    consecutive_list = []

    for value in data_:

        # a gap ends the current run: yield it and start a new one
        if consecutive_list and value - consecutive_list[-1] != 1:
            yield consecutive_list
            consecutive_list = []

        # the value always belongs to the current run, so isolated
        # values end up in a one-element list instead of being lost
        consecutive_list.append(value)

    # yield the last run, unless the input was empty
    if consecutive_list:
        yield consecutive_list

Test:

#Stephen's list 
print(list(chunk_lists_(list(range(0, 11)) +
        list(range(80, 91)) +
        list(range(160, 171)))))

Output:

[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90], [160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170]]
