How do I get the groups produced by groupby() as lists?

Time: 2016-01-28 14:34:39

Tags: python

I am testing itertools.groupby() and trying to get the groups as lists, but I can't figure out how to make it work.

Using the example from How do I use Python's itertools.groupby()?:

from itertools import groupby

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
         ("vehicle", "speed boat"), ("vehicle", "school bus")]

I tried (Python 3.5):

g = groupby(things, lambda x: x[0])
ll = list(g)
list(tuple(ll[0])[1])

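I expected to get the first group ("animal") as the list ['bear', 'duck'], but all I get at the REPL is an empty list.

What am I doing wrong?

How should I extract all three groups as lists?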

3 answers:

Answer 0 (score: 2)

If you just want the groups without the keys, you need to realize each group generator as you go, per the docs:

  

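Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list.

This means that when you list-ify the groupby generator first with ll = list(g), before converting the individual group generators, all but the last group generator will be invalid/empty.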
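To make the effect above concrete, here is a minimal sketch using the `things` list from the question. The variable names `first_consumed_in_time` and `first_consumed_too_late` are made up for this illustration. It compares consuming a group right away with consuming it after the `groupby` object has already moved on:

```python
from itertools import groupby

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
          ("vehicle", "speed boat"), ("vehicle", "school bus")]

# Case 1: consume the group before advancing the groupby object.
g = groupby(things, key=lambda x: x[0])
key, group = next(g)
first_consumed_in_time = list(group)

# Case 2: advance the groupby object first, then try to consume the old group.
g = groupby(things, key=lambda x: x[0])
key, group = next(g)
next(g)  # moves on to the "plant" group; the "animal" group is now stale
first_consumed_too_late = list(group)

print(first_consumed_in_time)   # [('animal', 'bear'), ('animal', 'duck')]
print(first_consumed_too_late)  # []
```

Calling list(g) on the whole groupby object, as in the question, advances it past every group in the same way, which is why ll[0][1] comes back empty.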

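(Note that list is just one option; tuple or any other container would work as well.)

So to do this correctly, you have to make sure you list-ify each group generator before moving on to the next one: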

from operator import itemgetter  # Nicer than ad-hoc lambdas

# Make the key, group generator
gen = groupby(things, key=itemgetter(0))

# Strip the keys; you only care about the group generators
# In Python 2, you'd use future_builtins.map, because a non-generator map would break
groups = map(itemgetter(1), gen)

# Convert them to list one by one before the next group is pulled
groups = map(list, groups)

# And listify the result (to actually run out the generator and get all your
# results, assuming you need them as a list)
groups = list(groups)

As a one-liner:

groups = list(map(list, map(itemgetter(1), groupby(things, key=itemgetter(0)))))

Or, since that many map calls gets rather ugly/un-Pythonic, and list comprehensions let us do nice things like unpacking to get named values, we can simplify to:

groups = [list(g) for k, g in groupby(things, key=itemgetter(0))]
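The question expected ['bear', 'duck'] for the first group, that is, only the second element of each tuple. One possible way to get that, sketched here with the illustrative names `values_only` and `by_key` (not from the original posts), is to unpack each tuple inside the comprehension while still consuming every group before the next one is pulled:

```python
from itertools import groupby
from operator import itemgetter

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
          ("vehicle", "speed boat"), ("vehicle", "school bus")]

# One list of values per group, keys dropped.
values_only = [[name for _, name in g]
               for _, g in groupby(things, key=itemgetter(0))]

# The same values, but keyed by the group key.
by_key = {k: [name for _, name in g]
          for k, g in groupby(things, key=itemgetter(0))}

print(values_only)  # [['bear', 'duck'], ['cactus'], ['speed boat', 'school bus']]
print(by_key)
```

Note that the dict version would overwrite an earlier entry if the same key appeared in two non-adjacent runs; with the things list above, where equal keys are adjacent, that does not happen.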

Answer 1 (score: 1)

You can use a list comprehension as follows:

from itertools import groupby

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
         ("vehicle", "speed boat"), ("vehicle", "school bus")]


g = groupby(things, lambda x: x[0])
answer = [list(group[1]) for group in g]
print(answer)

Output:

[[('animal', 'bear'), ('animal', 'duck')],
 [('plant', 'cactus')],
 [('vehicle', 'speed boat'), ('vehicle', 'school bus')]]

Answer 2 (score: 0)

Quoting the Python documentation on groupby:

  

itertools.groupby(iterable, key=None)
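  Make an iterator that returns consecutive keys and groups from the iterable. The key is a function computing a key value for each element. If not specified or is None, key defaults to an identity function and returns the element unchanged. Generally, the iterable needs to already be sorted on the same key function.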
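The last sentence of that quote matters in practice: groupby() only merges adjacent elements with equal keys. A short sketch, using a made-up `words` list and a hypothetical helper `first_letter` that are not part of the original question, shows the difference sorting makes:

```python
from itertools import groupby

words = ["apple", "bee", "avocado", "banana"]


def first_letter(word):
    """Key function: group words by their first character."""
    return word[0]


# Without sorting, equal keys that are not adjacent end up in separate groups.
unsorted_groups = [(k, list(g)) for k, g in groupby(words, key=first_letter)]

# Sorting on the same key function first brings equal keys together.
sorted_groups = [(k, list(g))
                 for k, g in groupby(sorted(words, key=first_letter),
                                     key=first_letter)]

print(unsorted_groups)
# [('a', ['apple']), ('b', ['bee']), ('a', ['avocado']), ('b', ['banana'])]
print(sorted_groups)
# [('a', ['apple', 'avocado']), ('b', ['bee', 'banana'])]
```

The things list in the question already has equal keys next to each other, so it needs no sorting.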

>>> from itertools import groupby
>>> 
>>> things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
         ("vehicle", "speed boat"), ("vehicle", "school bus")]
>>> 
>>> 
>>> for _, g in groupby(things, lambda x:x[0]):
    print(list(g))

[('animal', 'bear'), ('animal', 'duck')]
[('plant', 'cactus')]
[('vehicle', 'speed boat'), ('vehicle', 'school bus')]
>>>
>>> from operator import itemgetter
>>> l = [list(g) for _, g in groupby(things, itemgetter(0))]
>>> l
[[('animal', 'bear'), ('animal', 'duck')], [('plant', 'cactus')], [('vehicle', 'speed boat'), ('vehicle', 'school bus')]]
>>> from collections import defaultdict
>>> 
>>> d = defaultdict(list)
>>>
>>> for k, v in groupby(things, itemgetter(0)):
    # Each element of v is a (key, value) tuple; keep only the value
    d[k].extend(value for _, value in v)


>>> d
defaultdict(<class 'list'>, {'animal': ['bear', 'duck'], 'plant': ['cactus'], 'vehicle': ['speed boat', 'school bus']})