Question

Here's the situation: I have two lists:

list_one=[1,2,9,9,9,3,4,9,9,9,9,2]
list_two=["A","B","C","D","A","E","F","G","H","Word1","Word2"]

I want to find the runs of consecutive 9s in list_one so that I can get the corresponding strings from list_two. Here is what I have tried:

group_list_one= [(k, sum(1 for i in g),pdn.index(k)) for k,g in groupby(list_one)]

I was hoping to get the index of the first 9 in each tuple and go from there, but that didn't work.

What can I do here? P.S.: I have looked at the itertools documentation, but it seems very vague to me. Thanks in advance.

Edit: the expected output is (key, occurrences, index_of_first_occurrence)

Something like:

[(9, 3, 2), (9, 4, 7)]

Answer 1

Judging by your expected output, try this:

from itertools import groupby

list_one=[1,2,9,9,9,3,4,9,9,9,9,2]
list_two=["A","B","C","D","A","E","F","G","H","Word1","Word2"]
data = zip(list_one, list_two)
i = 0
out = []

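for key, group in groupby(data, lambda x: x[0]):
    number, word = next(group)
    elems = len(list(group)) + 1  # +1 for the element already consumed by next()
    if number == 9 and elems > 1:
        out.append((key, elems, i))
    i += elems  # i is the index of the first element of the next group

print(out)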

Output:

[(9, 3, 2), (9, 4, 7)]

But if you actually want output like this:

[(9, 3, 'C'), (9, 4, 'G')]

then take a look at this snippet:

from itertools import groupby

list_one=[1,2,9,9,9,3,4,9,9,9,9,2]
list_two=["A","B","C","D","A","E","F","G","H","Word1","Word2"]
data = zip(list_one, list_two)
out = []

for key, group in groupby(data, lambda x: x[0]):
    number, word = next(group)
    elems = len(list(group)) + 1
    if number == 9 and elems > 1:
        out.append((key, elems, word))

print(out)

Answer 2

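OK, I have a one-liner solution. It's ugly, but bear with me.

Let's think about the problem. We have a list that we want to summarize using itertools.groupby. groupby gives us a sequence of keys, each paired with an iterator over that key's consecutive repetitions. At this stage we can't compute the index, but we can easily find the number of occurrences.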

[(key, len(list(it))) for (key, it) in itertools.groupby(list_one)]
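
To make that first step concrete, here is a small sketch of the same comprehension run on the question's list_one. The variable name run_lengths is only for illustration and is not part of the original answer.

```python
from itertools import groupby

list_one = [1, 2, 9, 9, 9, 3, 4, 9, 9, 9, 9, 2]

# list(it) consumes each group iterator, so its length is the run length.
run_lengths = [(key, len(list(it))) for key, it in groupby(list_one)]
print(run_lengths)
# [(1, 1), (2, 1), (9, 3), (3, 1), (4, 1), (9, 4), (2, 1)]
```

Note that the two runs of 9 show up as separate entries, because groupby only merges adjacent equal items.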

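Now, the real problem is that we want to compute an index that depends on earlier data. Most functions commonly used in one-liners only look at the current item. However, there is one function that lets us peek at the past: reduce.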

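What reduce does is walk through an iterable and call a function with the function's previous result and the next item. For example, reduce(lambda x,y: x*y, [2,3,4]) computes 2 * 3 = 6, then 6 * 4 = 24, and returns 24. You can also supply an initial value for x instead of starting from the first item.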
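
A quick sketch of both forms, with and without an initial value. It assumes Python 3, where reduce has to be imported from functools; the variable names are illustrative.

```python
from functools import reduce  # in Python 3, reduce lives in functools

# No initial value: starts from the first item, (2 * 3) * 4.
product = reduce(lambda x, y: x * y, [2, 3, 4])

# With an initial value of 10: ((10 * 2) * 3) * 4.
product_from_10 = reduce(lambda x, y: x * y, [2, 3, 4], 10)

print(product, product_from_10)  # 24 240
```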

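Let's use it here: for each item, the index is the previous item's index plus the previous item's number of occurrences. So that there is always a previous item to look at, we use [(0,0,0)] as the initial value. (We get rid of it at the end.)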

reduce(lambda lst,item: lst + [(item[0], item[1], lst[-1][1] + lst[-1][-1])], 
       [(key, len(list(it))) for (key, it) in itertools.groupby(list_one)], 
       [(0,0,0)])[1:]
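
Here is the same expression as a runnable sketch, split over a few lines so the pieces are easier to see. It assumes Python 3 (hence the functools import), and the names run_lengths and runs are made up for the example.

```python
from functools import reduce
from itertools import groupby

list_one = [1, 2, 9, 9, 9, 3, 4, 9, 9, 9, 9, 2]
run_lengths = [(key, len(list(it))) for key, it in groupby(list_one)]

# Next start index = previous count (lst[-1][1]) + previous start (lst[-1][-1]).
runs = reduce(
    lambda lst, item: lst + [(item[0], item[1], lst[-1][1] + lst[-1][-1])],
    run_lengths,
    [(0, 0, 0)],  # dummy seed so that lst[-1] exists on the first step
)[1:]  # drop the dummy seed again

print(runs)
# [(1, 1, 0), (2, 1, 1), (9, 3, 2), (3, 1, 5), (4, 1, 6), (9, 4, 7), (2, 1, 11)]
```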

If we don't want to add the initial value, we can instead sum the number of occurrences seen so far.

reduce(lambda lst,item: lst + [(item[0], item[1], sum(map(lambda i: i[1], lst)))],
       [(key, len(list(it))) for (key, it) in itertools.groupby(list_one)], [])

Of course, this gives us every number. If we only want the 9s, we can wrap the whole thing in filter:

filter(lambda item: item[0] == 9, ... )
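
Putting the pieces together, a sketch of the filtered version might look like this. It assumes Python 3, where filter returns a lazy iterator and needs list() around it to print the result; all_runs and nines are illustrative names.

```python
from functools import reduce
from itertools import groupby

list_one = [1, 2, 9, 9, 9, 3, 4, 9, 9, 9, 9, 2]

# Variant without a seed tuple: the start index is the sum of all earlier counts.
all_runs = reduce(
    lambda lst, item: lst + [(item[0], item[1], sum(map(lambda i: i[1], lst)))],
    [(key, len(list(it))) for key, it in groupby(list_one)],
    [],
)

# Keep only the runs whose key is 9.
nines = list(filter(lambda item: item[0] == 9, all_runs))
print(nines)  # [(9, 3, 2), (9, 4, 7)]
```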

Answer 3

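Well, this may not be the most elegant solution, but here goes:

from itertools import groupby

# enumerate pairs each value with its index; group on the value (x[1])
g = groupby(enumerate(list_one), lambda x:x[1])
l = [(x[0], list(x[1])) for x in g if x[0] == 9]
[(x[0], len(x[1]), x[1][0][0]) for x in l]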

which gives

[(9, 3, 2), (9, 4, 7)]
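
Since the question's real goal was to fetch the matching strings from list_two, here is one possible way to go from those tuples to the words. This step is not in the original answer; the slicing approach and the names runs, result and words are assumptions made for the sketch.

```python
from itertools import groupby

list_one = [1, 2, 9, 9, 9, 3, 4, 9, 9, 9, 9, 2]
list_two = ["A", "B", "C", "D", "A", "E", "F", "G", "H", "Word1", "Word2"]

# Same idea as above: group (index, value) pairs by value, keep the 9 runs.
runs = [(key, list(group))
        for key, group in groupby(enumerate(list_one), lambda x: x[1])
        if key == 9]
result = [(key, len(group), group[0][0]) for key, group in runs]

# Slice list_two with each run's start index and length. A slice is used
# (an assumption) because list_two is one item shorter than list_one, and
# slicing cuts off at the end instead of raising IndexError.
words = [list_two[start:start + count] for _, count, start in result]
print(words)
# [['C', 'D', 'A'], ['G', 'H', 'Word1', 'Word2']]
```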

Answer 4

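This looks like too complicated a problem to cram into a list comprehension.

import itertools

# yield is only valid inside a function, so the loop is wrapped in a
# generator; the name runs_with_index is just illustrative.
def runs_with_index(list_one):
    element_index = 0 #the index in list_one of the first element in a group
    for element, occurrences in itertools.groupby(list_one):
        count = sum(1 for i in occurrences)
        yield (element, count, element_index)
        element_index += count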

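If you want to eliminate the element_index variable, think about what a cumulative_sum function would need to do: its value depends on all the values that have already been iterated over.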
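
One way to read that hint, as a rough sketch: the answer does not define cumulative_sum, so this example substitutes itertools.accumulate (available in Python 3.2 and later) as the running-sum helper. The names runs, counts, starts and result are made up here.

```python
from itertools import accumulate, groupby

list_one = [1, 2, 9, 9, 9, 3, 4, 9, 9, 9, 9, 2]

runs = [(key, sum(1 for _ in group)) for key, group in groupby(list_one)]
counts = [count for _, count in runs]

# A run starts where all earlier runs end: shift the running total by one
# position and begin at 0.
starts = [0] + list(accumulate(counts))[:-1]

result = [(key, count, start) for (key, count), start in zip(runs, starts)]
print(result)
# [(1, 1, 0), (2, 1, 1), (9, 3, 2), (3, 1, 5), (4, 1, 6), (9, 4, 7), (2, 1, 11)]
```

This trades the mutable counter for two extra passes over the run list, which is fine for short inputs like this one.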

How to use itertools.groupby() to get the index and number of occurrences of each item

4 Answers: