Python: split a list by keyword

Date: 2013-04-17 05:30:22

Tags: python split

I have a list like the following:

[['MARK_A', 8, 7702.5, 13, 7703, 983472],
['MARK_B', 10, 7702.5, 983472],
['MARK_B', 3, 7703.5, 983472],
['MARK_B', 6, 7701.2, 983472],
['MARK_B', 5, 7704.4, 983472],
['MARK_A', 9, 7701.5, 11, 7704, 983475],
['MARK_B', 10, 7702.5, 983475],
['MARK_B', 3, 7703.5, 983475],
['MARK_B', 6, 7701.2, 983475],
['MARK_B', 5, 7704.4, 983475]]

How can I split this list into 2 lists, like this:

[['MARK_A', 8, 7702.5, 13, 7703, 983472],
['MARK_B', 10, 7702.5, 983472],
['MARK_B', 3, 7703.5, 983472],
['MARK_B', 6, 7701.2, 983472],
['MARK_B', 5, 7704.4, 983472]],
[['MARK_A', 9, 7701.5, 11, 7704, 983475],
['MARK_B', 10, 7702.5, 983475],
['MARK_B', 3, 7703.5, 983475],
['MARK_B', 6, 7701.2, 983475],
['MARK_B', 5, 7704.4, 983475]]

The list can contain any number of "MARK_A" rows, each followed by one or more "MARK_B" rows. I would like to split the list by the [-1] element.

1 answer:

Answer 0 (score: 3)

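I use itertools.groupby for this kind of problem (taking a collection of collections and breaking it apart at a marker found in the inner collections):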

from itertools import groupby

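class GroupbyHelper:
    """Key function for groupby: a running count of rows whose first element is `token`."""

    def __init__(self, token):
        self.token = token
        self.count = 0

    def __call__(self, item):
        # A bool adds as 0 or 1, so the count (the group key) goes up at each marker row.
        self.count += (item[0] == self.token)
        return self.count


# `collections` is the list of rows from the question.
grouped_collections = \
[list(grouped) for _, grouped in groupby(collections, GroupbyHelper("MARK_A"))]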
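To see why this works, it can help to look at the keys the helper returns. The following is only a small sketch, not part of the original answer: `rows` is a shortened stand-in for the question's list, and the names `helper`, `keys` and `grouped` are introduced here for illustration.

```python
from itertools import groupby


class GroupbyHelper:
    def __init__(self, token):
        self.token = token
        self.count = 0

    def __call__(self, item):
        self.count += (item[0] == self.token)
        return self.count


# Shortened stand-in for the question's data (assumed for illustration).
rows = [
    ['MARK_A', 8, 7702.5, 13, 7703, 983472],
    ['MARK_B', 10, 7702.5, 983472],
    ['MARK_B', 3, 7703.5, 983472],
    ['MARK_A', 9, 7701.5, 11, 7704, 983475],
    ['MARK_B', 10, 7702.5, 983475],
]

# One helper instance sees the rows in order; its count rises at each 'MARK_A'.
helper = GroupbyHelper("MARK_A")
keys = [helper(row) for row in rows]

# groupby starts a new group whenever the key changes, so use a fresh
# helper here because the one above has already counted these rows.
grouped = [list(g) for _, g in groupby(rows, GroupbyHelper("MARK_A"))]
print(keys)
print([len(g) for g in grouped])
```

Because the helper keeps state, each pass over the data needs its own instance. Rows that come before the first 'MARK_A' would all get key 0 and end up in a group of their own.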

A somewhat more general version of the code above:

from itertools import groupby

class GroupbyHelper:
    def __init__(self, check_token):
        self.check_token = check_token
        self.count = 0

    def __call__(self, item):
        self.count += self.check_token(item)
        return self.count


grouped_collections = \
[list(grouped) for _, grouped in 
 groupby(collections, GroupbyHelper(lambda x: x[0] == "MARK_A"))]

Using two iterators:

from itertools import tee, zip_longest

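# Both iterators yield the indices of the rows that start a new group.
iter1, iter2 = tee(i for i, item in enumerate(collections) if item[0] == 'MARK_A')
# Advance the second iterator so each pair is (start, next start);
# the default avoids StopIteration when there is no 'MARK_A' at all.
next(iter2, None)

grouped_collections = \
[collections[s:e] for s, e in zip_longest(iter1, iter2, fillvalue=len(collections))]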
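The slicing is easier to follow once the (start, end) pairs are written out. This is a minimal sketch under the same assumption as before, namely that `rows` is a shortened stand-in for the question's list; `starts` and `bounds` are names introduced here.

```python
from itertools import tee, zip_longest

# Shortened stand-in for the question's data (assumed for illustration).
rows = [
    ['MARK_A', 8, 7702.5, 13, 7703, 983472],
    ['MARK_B', 10, 7702.5, 983472],
    ['MARK_B', 3, 7703.5, 983472],
    ['MARK_A', 9, 7701.5, 11, 7704, 983475],
    ['MARK_B', 10, 7702.5, 983475],
]

# Indices of the marker rows.
starts = (i for i, row in enumerate(rows) if row[0] == 'MARK_A')
iter1, iter2 = tee(starts)
next(iter2, None)  # iter2 now runs one marker ahead of iter1

# Pair each start with the next start; the last group ends at len(rows).
bounds = list(zip_longest(iter1, iter2, fillvalue=len(rows)))
grouped = [rows[s:e] for s, e in bounds]
print(bounds)
```

Note that any rows before the first 'MARK_A' fall outside every slice and are dropped, which differs from the groupby version.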

Sometimes a simple for loop is not so bad:

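grouped_collections = []
for lst in collections:
    # Start a new group at each marker row; the second check avoids an
    # IndexError if the data does not begin with "MARK_A".
    if lst[0]=="MARK_A" or not grouped_collections:
        grouped_collections.append([lst])
    else:
        grouped_collections[-1].append(lst)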
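The question also mentions dividing the list by the [-1] element. If that is the goal, one possible sketch is to group on the last element of each row directly. This assumes that rows sharing the same last element are adjacent, since groupby only merges consecutive items; `rows` and `by_last` are names introduced here, and `rows` is a shortened stand-in for the question's list.

```python
from itertools import groupby
from operator import itemgetter

# Shortened stand-in for the question's data (assumed for illustration).
rows = [
    ['MARK_A', 8, 7702.5, 13, 7703, 983472],
    ['MARK_B', 10, 7702.5, 983472],
    ['MARK_B', 3, 7703.5, 983472],
    ['MARK_A', 9, 7701.5, 11, 7704, 983475],
    ['MARK_B', 10, 7702.5, 983475],
]

# itemgetter(-1) returns row[-1], so consecutive rows with the same
# last element land in the same group.
by_last = [list(group) for _, group in groupby(rows, key=itemgetter(-1))]
print([len(group) for group in by_last])
```

For this sample data the result matches splitting at 'MARK_A', but the two approaches would diverge if two consecutive 'MARK_A' blocks shared the same last element.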