Split a list by runs of consecutive identical elements

Asked: 2015-01-05 09:09:06

Tags: python

I have the following list, which contains only the two characters 'N' and 'C':

ls = ['N', 'N', 'N', 'C', 'C', 'C', 'C', 'N', 'C', 'C']

What I want to do is extract each run of consecutive 'C's and return their indices in the list,

producing something like:
  chunk1 = [('C', 'C', 'C', 'C'), [3,4,5,6]]
  chunk2 = [('C', 'C'), [8,9]]

  # and when there is no 'C', it returns an empty list.

How can I achieve this in Python?

I tried this, but it does not do what I hoped:

from itertools import groupby
from operator import itemgetter
tmp = (list(g) for k, g in groupby(enumerate(ls), itemgetter(1)) if k == 'C')
zip(*tmp)

3 Answers:

Answer 0 (score: 5)

Move the zip(*...) into the list comprehension:

import itertools as IT
import operator

ls = ['N', 'N', 'N', 'C', 'C', 'C', 'C', 'N', 'C', 'C']

[list(zip(*g))[::-1] 
 for k, g in IT.groupby(enumerate(ls), operator.itemgetter(1)) 
 if k == 'C']

which yields

[[('C', 'C', 'C', 'C'), (3, 4, 5, 6)], [('C', 'C'), (8, 9)]]
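
To see why both zip(*g) and [::-1] are needed, it may help to look at the intermediate values. This is only a sketch; the names groups and first are introduced here for illustration and are not part of the answer's code:

```python
import itertools as IT
import operator

ls = ['N', 'N', 'N', 'C', 'C', 'C', 'C', 'N', 'C', 'C']

# enumerate pairs each element with its index; groupby then groups
# consecutive pairs whose element (position 1 of the pair) is equal.
groups = [(k, list(g))
          for k, g in IT.groupby(enumerate(ls), operator.itemgetter(1))]

first = groups[1][1]            # the first run of 'C'
print(first)                    # [(3, 'C'), (4, 'C'), (5, 'C'), (6, 'C')]

# zip(*...) transposes the (index, element) pairs into two tuples,
# indices first; [::-1] swaps them so the elements come first.
print(list(zip(*first)))        # [(3, 4, 5, 6), ('C', 'C', 'C', 'C')]
print(list(zip(*first))[::-1])  # [('C', 'C', 'C', 'C'), (3, 4, 5, 6)]
```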

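In Python 2, list(zip(...)) can be replaced with zip(...), but because zip returns an iterator in Python 3, list(zip(...)) is required there. To keep the solution compatible with both Python 2 and Python 3, use list(zip(...)) here.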
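
A small illustration of that difference, assuming a sample list named pairs made up for this example: slicing the result of zip directly fails on Python 3, while wrapping it in list(...) works on either version.

```python
pairs = [(3, 'C'), (4, 'C')]   # sample data, for illustration only

z = zip(*pairs)
try:
    z[::-1]                    # fine on Python 2, where zip returns a list
    sliceable = True
except TypeError:
    sliceable = False          # Python 3: a zip object cannot be sliced

result = list(zip(*pairs))[::-1]   # works on both versions
print(sliceable)
print(result)                      # [('C', 'C'), (3, 4)]
```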

Answer 1 (score: 2)

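Use a generator function. All you need to do is reverse the result after unzipping each group, so use yield list(zip(*group))[::-1] (the list(...) call is needed on Python 3, where zip returns an iterator):

from itertools import groupby
from operator import itemgetter
def solve(ls):
    # group consecutive (index, element) pairs by their element
    for key, group in groupby(enumerate(ls), itemgetter(1)):
        if key == 'C':
            # transpose the pairs, then put the elements before the indices
            yield list(zip(*group))[::-1]

ls = ['N', 'N', 'N', 'C', 'C', 'C', 'C', 'N', 'C', 'C']
print(list(solve(ls)))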


[[('C', 'C', 'C', 'C'), (3, 4, 5, 6)], [('C', 'C'), (8, 9)]]
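
The question also asks for an empty list when there is no 'C'. With a generator this should fall out naturally, because nothing is yielded. A quick check, with solve written using list(zip(...)) so that it also runs on Python 3:

```python
from itertools import groupby
from operator import itemgetter

def solve(ls):
    for key, group in groupby(enumerate(ls), itemgetter(1)):
        if key == 'C':
            yield list(zip(*group))[::-1]

print(list(solve(['N', 'N', 'N'])))   # []
print(list(solve([])))                # []
print(list(solve(['C'])))             # [[('C',), (0,)]]
```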

Answer 2 (score: 1)

ls = ['N', 'N', 'N', 'C', 'C', 'C', 'C', 'N', 'C', 'C']

def whereMyCharsAt(haystack, needle):
    start = None  # index where the current run of needle began, if any
    for ii, char in enumerate(haystack):
        if char == needle:
            if start is None:
                start = ii
        else:
            if start is not None:
                # the run ended just before ii
                yield [needle] * (ii - start), list(range(start, ii))
                start = None

    # a run that extends to the end of the list
    if start is not None:
        yield [needle] * (len(haystack) - start), list(range(start, len(haystack)))

for indexes in whereMyCharsAt(ls, 'C'):
    print(indexes)

Prints:

(['C', 'C', 'C', 'C'], [3, 4, 5, 6])
(['C', 'C'], [8, 9])
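
Because the needle is a parameter, the same generator can be pointed at any element, and it yields nothing when the element never occurs. A short sketch of that, repeating the function (with list(range(...)) so the indices print as lists on Python 3); the name runs_of_n is only illustrative:

```python
def whereMyCharsAt(haystack, needle):
    start = None
    for ii, char in enumerate(haystack):
        if char == needle:
            if start is None:
                start = ii
        elif start is not None:
            yield [needle] * (ii - start), list(range(start, ii))
            start = None
    if start is not None:
        yield [needle] * (len(haystack) - start), list(range(start, len(haystack)))

ls = ['N', 'N', 'N', 'C', 'C', 'C', 'C', 'N', 'C', 'C']

runs_of_n = list(whereMyCharsAt(ls, 'N'))
print(runs_of_n)                        # [(['N', 'N', 'N'], [0, 1, 2]), (['N'], [7])]
print(list(whereMyCharsAt(ls, 'X')))    # []
```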