How to split consecutive elements of a list into sublists

Time: 2019-04-30 05:17:55

Tags: python arrays pandas list split

I have the following list:

indices_to_remove: [0,1,2,3,..,600,800,801,802,....,1200,1600,1601,1602,...,1800]

Basically, I have 3 subsets of consecutive indices:

  1. 0-600
  2. 800-1200
  3. 1600-1800

I want to create 3 separate smaller lists, each containing only consecutive numbers.

Expected result:

indices_to_remove_1 : [0,1,2,3,....,600]
indices_to_remove_2 : [800,801,802,....,1200]
indices_to_remove_3 : [1600,1601,1602,....., 1800]

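P.S.: The numbers are arbitrary and random; also, I may end up with 3 subsets or fewer.

3 Answers:

Answer 0 (score: 3)

I like to use generators for this kind of problem. You could do it like this:

Splitting non-consecutive data: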

def split_non_consecutive(data):
    """Yield lists of consecutive integers taken from an iterable."""
    data = iter(data)
    try:
        val = next(data)
    except StopIteration:
        return  # empty input: nothing to yield
    chunk = [val]
    for val in data:
        # a gap ends the current chunk and starts a new one
        if val != chunk[-1] + 1:
            yield chunk
            chunk = []
        chunk.append(val)
    yield chunk

Test code:

indices_to_remove = (
    list(range(0, 11)) +
    list(range(80, 91)) +
    list(range(160, 171))
)

for i in split_non_consecutive(indices_to_remove):
    print(i)

Results:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90]
[160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170]

Answer 1 (score: 1)

Another approach is to use more_itertools.consecutive_groups from the third-party more_itertools package (using @Stephen's list as the example):

import more_itertools as mit
for group in mit.consecutive_groups(indices_to_remove):
    print(list(group))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
[80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90]
[160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170]
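
If adding a third-party dependency is not an option, a similar grouping can be sketched with the standard library's itertools.groupby. This is only a minimal sketch, not the more_itertools implementation; the helper name consecutive_runs is made up for illustration. It relies on the fact that, within a run of consecutive integers, the difference between a value and its position stays constant.

```python
from itertools import groupby


def consecutive_runs(values):
    """Hypothetical helper: yield each run of consecutive integers as a list."""
    # Inside a run, (value - position) is constant, so it serves as the group key.
    for _, group in groupby(enumerate(values), key=lambda pair: pair[1] - pair[0]):
        yield [value for _, value in group]


indices_to_remove = list(range(0, 11)) + list(range(80, 91)) + list(range(160, 171))
runs = list(consecutive_runs(indices_to_remove))
print(len(runs))    # 3
print(runs[1][:3])  # [80, 81, 82]
```

Because the result is an ordinary list of lists, it works the same way whether the input contains three runs or fewer.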

Answer 2 (score: 0)

No need to overcomplicate it; you can simply solve it like this:

def chunk_lists_(data_):
    consecutive_list = []
    for value in data_:
        # a value that does not follow the previous one closes the current chunk
        if consecutive_list and value != consecutive_list[-1] + 1:
            yield consecutive_list
            consecutive_list = []
        consecutive_list.append(value)
    # yield the last chunk, unless the input was empty
    if consecutive_list:
        yield consecutive_list
  

Test:

# Stephen's list
print(list(chunk_lists_(list(range(0, 11)) +
        list(range(80, 91)) +
        list(range(160, 171)))))
  

Output:

[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90], [160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 170]]
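
An index-based variant can also be sketched with plain slicing: find the positions where a new run starts, then cut the list at those positions. The helper name split_at_gaps is hypothetical, and the sketch assumes the input is a list (or another sequence that supports slicing) of integers.

```python
def split_at_gaps(data):
    """Hypothetical helper: slice a list wherever neighbours are not consecutive."""
    if not data:
        return []
    # Positions where a new run starts (the element does not follow its predecessor).
    starts = [i for i in range(1, len(data)) if data[i] != data[i - 1] + 1]
    bounds = [0, *starts, len(data)]
    # Pair up neighbouring boundaries and slice out each run.
    return [data[a:b] for a, b in zip(bounds, bounds[1:])]


chunks = split_at_gaps([0, 1, 2, 8, 9, 20])
print(chunks)  # [[0, 1, 2], [8, 9], [20]]
```

An isolated value such as 20 ends up in its own one-element sublist, and an empty input gives an empty result.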