Question

I'm experimenting with itertools.groupby() and trying to get the groups as lists, but I can't figure out how to make it work.

Using the example from "How do I use Python's itertools.groupby()?":

from itertools import groupby

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
         ("vehicle", "speed boat"), ("vehicle", "school bus")]

I tried (Python 3.5):

g = groupby(things, lambda x: x[0])
ll = list(g)
list(tuple(ll[0])[1])

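I expected to get the first group ("animal") as the list ['bear', 'duck'], but in the REPL I just get an empty list.

What am I doing wrong?

How should I extract all three groups as lists?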

Answer 1

If you just want the groups without the keys, you need to realize each group generator as you go, per the docs:

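Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list.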

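This means that when you listify the groupby generator first with ll = list(g), before converting the individual group generators, all but the last group generator will be invalid/empty.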

(Note that list is just one option; a tuple or any other container would work too.)
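The failure mode described above can be reproduced with a minimal sketch like the following, which reuses the things list from the question. The names eager, stale and fresh are only illustrative. Whether the last group still yields anything appears to depend on the Python version, so the sketch only looks at the first group.

```python
from itertools import groupby
from operator import itemgetter

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
          ("vehicle", "speed boat"), ("vehicle", "school bus")]

# Exhausting the outer groupby iterator first advances past every group,
# so the stored group iterator for "animal" has nothing left to yield.
eager = list(groupby(things, key=itemgetter(0)))
first_key, first_group = eager[0]
stale = list(first_group)  # empty: the shared source has already moved on

# Consuming each group before asking for the next one keeps the data.
fresh = [(key, list(group)) for key, group in groupby(things, key=itemgetter(0))]
```

Here stale is the empty list the question ran into, while fresh holds all three groups with their members intact.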

So to do this correctly, you have to make sure to listify each group generator before moving on to the next:

from operator import itemgetter  # Nicer than ad-hoc lambdas

# Make the key, group generator
gen = groupby(things, key=itemgetter(0))

# Strip the keys; you only care about the group generators
# In Python 2, you'd use future_builtins.map, because a non-generator map would break
groups = map(itemgetter(1), gen)

# Convert them to list one by one before the next group is pulled
groups = map(list, groups)

# And listify the result (to actually run out the generator and get all your
# results, assuming you need them as a list)
groups = list(groups)

As a one-liner:

groups = list(map(list, map(itemgetter(1), groupby(things, key=itemgetter(0)))))

Or, since this many maps gets rather ugly/un-Pythonic, and list comprehensions let us do nice things like unpacking to get named values, we can simplify to:

groups = [list(g) for k, g in groupby(things, key=itemgetter(0))]
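Note that each group produced by the comprehension above still holds the original (category, name) tuples. If the goal is the bare names the question expected, such as ['bear', 'duck'], one possible sketch is a nested comprehension that drops the first element of every pair. The variable name names is an assumption made for illustration.

```python
from itertools import groupby
from operator import itemgetter

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
          ("vehicle", "speed boat"), ("vehicle", "school bus")]

# For each group, keep only the second element of every (category, name) pair.
# The inner comprehension consumes the group before groupby advances.
names = [[name for _, name in group]
         for _, group in groupby(things, key=itemgetter(0))]
```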

Answer 2

You can use a list comprehension as follows:

from itertools import groupby

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
         ("vehicle", "speed boat"), ("vehicle", "school bus")]


g = groupby(things, lambda x: x[0])
answer = [list(group[1]) for group in g]
print(answer)

Output:

[[('animal', 'bear'), ('animal', 'duck')],
 [('plant', 'cactus')],
 [('vehicle', 'speed boat'), ('vehicle', 'school bus')]]

Answer 3

Quoting the Python documentation on groupby:

itertools.groupby(iterable, key=None)
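Make an iterator that returns consecutive keys and groups from the iterable. The key is a function computing a key value for each element. If not specified or is None, key defaults to an identity function and returns the element unchanged. Generally, the iterable needs to already be sorted on the same key function.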
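The sorting requirement in the quoted passage is easy to miss, because groupby only merges consecutive items with equal keys. A small sketch, using a hypothetical and deliberately unsorted list called shuffled (not part of the original question), shows the difference:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical, deliberately unsorted variant of the question's data.
shuffled = [("animal", "bear"), ("plant", "cactus"), ("animal", "duck")]

# Unsorted input: "animal" shows up as two separate groups.
unsorted_keys = [key for key, _ in groupby(shuffled, key=itemgetter(0))]

# Sorting on the same key function first gives one group per key.
by_kind = sorted(shuffled, key=itemgetter(0))
sorted_groups = [(key, list(group))
                 for key, group in groupby(by_kind, key=itemgetter(0))]
```

Since sorted is stable, items that share a key keep their original relative order inside each group.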

>>> from itertools import groupby
>>> 
>>> things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
         ("vehicle", "speed boat"), ("vehicle", "school bus")]
>>> 
>>> 
>>> for _, g in groupby(things, lambda x:x[0]):
    print(list(g))

[('animal', 'bear'), ('animal', 'duck')]
[('plant', 'cactus')]
[('vehicle', 'speed boat'), ('vehicle', 'school bus')]
>>>
>>> from operator import itemgetter
>>> l = [list(g) for _, g in groupby(things, itemgetter(0))]
>>> l
[[('animal', 'bear'), ('animal', 'duck')], [('plant', 'cactus')], [('vehicle', 'speed boat'), ('vehicle', 'school bus')]]
>>> from collections import defaultdict
>>> 
>>> d = defaultdict(list)
>>>
>>> for k,v in groupby(things, itemgetter(0)):
    for sub in v:
        for item in sub:
            if item != k:
                d[k].append(item)


>>> d
defaultdict(<class 'list'>, {'animal': ['bear', 'duck'], 'plant': ['cactus'], 'vehicle': ['speed boat', 'school bus']})
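The nested loops above compare every element against the key, which would drop a value that happens to equal its key. As a hedged alternative, a dict comprehension that unpacks each pair might read more directly. The name by_key is illustrative, and the sketch assumes things is already ordered by key, as it is in the question.

```python
from itertools import groupby
from operator import itemgetter

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
          ("vehicle", "speed boat"), ("vehicle", "school bus")]

# Assumes `things` is already ordered by key; otherwise a later run of the
# same key would overwrite the earlier one in the resulting dict.
by_key = {key: [name for _, name in group]
          for key, group in groupby(things, key=itemgetter(0))}
```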

How do I get the groups generated by groupby() as lists?
