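Saving consecutive indices in sequence

Time: 2016-12-30 20:03:08

Tags: python itertools

I am analyzing events that occur in a sequence like the example below. It is a list of tuples, each holding a type and an index into a data frame. I want to collect the indices of consecutive elements that share the same type, starting a new group whenever the type changes.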

l=[('question', 0),
   ('response', 1),
   ('response', 2),
   ('response', 3),
   ('response', 4),
   ('response', 5),
   ('response', 6),
   ('response', 7),
   ('response', 8),
   ('response', 9),
   ('response', 10),
   ('response', 11),
   ('question', 12),
   ('response', 13),
   ('response', 14),
   ('response', 15),
   ('question', 16),
   ('response', 17),
   ('question', 18),
   ('response', 19),
   ('question', 20),
   ('response', 21),
   ('question', 22)
  ]

Expected output:

[('question', [0]),
 ('response', [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]),
 ('question', [12]),
 ('response', [13, 14, 15]),
 ('question', [16]),
 ('response', [17]),
 ('question', [18]),
 ('response', [19]),
 ('question', [20]),
 ('response', [21]),
 ('question', [22])]

Here is my solution. Is there a better way?

def fxn(listitem):
    # Group consecutive (type, index) tuples into (type, [indices]) runs.
    collected_items = []
    newlist = None  # current run, stored as [type, [indices]]
    for element in listitem:
        if newlist is not None and element[0] == newlist[0]:
            # Same type as the current run: add the index to it.
            newlist[1].append(element[1])
        else:
            # First element or type change: close the old run, start a new one.
            if newlist is not None:
                collected_items.append(tuple(newlist))
            newlist = [element[0], [element[1]]]
    # Flush the last run, which the loop itself never closes.
    if newlist is not None:
        collected_items.append(tuple(newlist))
    return collected_items

fxn(l)

2 answers:

Answer 0 (score: 4)

Here is one way to do it, using itertools.groupby inside a list comprehension:

from itertools import groupby

r = [(k, [y for _, y in g]) for k, g in groupby(l, lambda x: x[0])]
print(r)
# [('question', [0]), ('response', [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]), ('question', [12]), ('response', [13, 14, 15]), ('question', [16]), ('response', [17]), ('question', [18]), ('response', [19]), ('question', [20]), ('response', [21]), ('question', [22])]
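
This works because groupby only merges adjacent items whose keys are equal, so a type that reappears later starts a fresh group instead of being folded into an earlier one. A minimal sketch of the same idea, assuming a shortened, made-up list (`events` and `group_runs` are hypothetical names, not from the question) and using operator.itemgetter in place of the lambda:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical, shortened sample in the same (type, index) shape as `l`.
events = [('question', 0), ('response', 1), ('response', 2), ('question', 3)]

def group_runs(pairs):
    """Collapse consecutive pairs sharing a type into (type, [indices])."""
    return [(kind, [idx for _, idx in run])
            for kind, run in groupby(pairs, key=itemgetter(0))]

runs = group_runs(events)
print(runs)
# [('question', [0]), ('response', [1, 2]), ('question', [3])]
```

Note that 'question' appears twice in the result: the input is not sorted by type first, which is what preserves the order of the sequence.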

Answer 1 (score: 1)

Here is a solution written as a generator:

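def my_fxn(input_list):
    output = None  # current run as (type, [indices])
    for key, value in input_list:
        if output is None or key != output[0]:
            # Type changed: emit the finished run before starting a new one.
            if output is not None:
                yield output
            output = (key, [value])
        else:
            output[1].append(value)
    # Emit the last run; skip it when the input was empty.
    if output is not None:
        yield output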
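
Since this is a generator, nothing is grouped until it is iterated. The sketch below shows one way it might be consumed, either one group at a time with next() or all at once with list(). The function is repeated so the snippet runs on its own, with a guard on the final yield so that empty input produces no groups; the `sample` list is a shortened, made-up input, not data from the question.

```python
def my_fxn(input_list):
    # Copy of the generator above so this snippet is self-contained.
    output = None
    for key, value in input_list:
        if output is None or key != output[0]:
            if output is not None:
                yield output
            output = (key, [value])
        else:
            output[1].append(value)
    # Guard: only emit the last run if there was any input at all.
    if output is not None:
        yield output

# Hypothetical shortened sample in the same shape as `l`.
sample = [('question', 0), ('response', 1), ('response', 2), ('question', 3)]

gen = my_fxn(sample)
first = next(gen)   # pulls only the first finished run
rest = list(gen)    # drains the remaining runs
print(first)
# ('question', [0])
print(rest)
# [('response', [1, 2]), ('question', [3])]
```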