I'm experimenting with itertools.groupby() and trying to get the groups out as lists, but I can't figure out how to make it work.
Using the example from "How do I use Python's itertools.groupby()?":
from itertools import groupby
things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
("vehicle", "speed boat"), ("vehicle", "school bus")]
I tried (Python 3.5):
g = groupby(things, lambda x: x[0])
ll = list(g)
list(tuple(ll[0])[1])
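I expected to get the first group ("animal") as the list ['bear', 'duck'], but all I get in the REPL is an empty list.
What am I doing wrong?
How should I extract all three groups as lists?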
Answer 0 (score: 2)
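If you just want the groups without the keys, you need to realize each group generator as you go, per the docs:
Because the source is shared, when the groupby() object is advanced, the previous group is no longer visible. So, if that data is needed later, it should be stored as a list.
This means that when you listify the groupby generator first with ll = list(g), before converting the individual group generators, all of the group generators except possibly the last will already be invalid/empty. (Note that list is just one option; a tuple or any other container would work too.)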
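To see the quoted behaviour concretely, here is a minimal sketch using the things list from the question. The names stale, stale_first and fresh are only illustrative and are not part of the original post.

```python
from itertools import groupby
from operator import itemgetter

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
          ("vehicle", "speed boat"), ("vehicle", "school bus")]

# Wrong order: exhaust the outer groupby iterator first, then look at a group.
stale = list(groupby(things, key=itemgetter(0)))
first_key, first_group = stale[0]
# The shared source has already moved past the "animal" items,
# so this group has nothing left to yield.
stale_first = list(first_group)

# Right order: turn each group into a list before asking for the next one.
fresh = [(key, list(group)) for key, group in groupby(things, key=itemgetter(0))]

print(first_key, stale_first)
print(fresh[0])
```

The keys survive in stale because they are plain values; only the lazily evaluated groups are lost.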
So to do this correctly, you have to make sure you listify each group generator before moving on to the next:
from operator import itemgetter # Nicer than ad-hoc lambdas
# Make the key, group generator
gen = groupby(things, key=itemgetter(0))
# Strip the keys; you only care about the group generators
# In Python 2, you'd use future_builtins.map, because a non-generator map would break
groups = map(itemgetter(1), gen)
# Convert them to list one by one before the next group is pulled
groups = map(list, groups)
# And listify the result (to actually run out the generator and get all your
# results, assuming you need them as a list)
groups = list(groups)
As a one-liner:
groups = list(map(list, map(itemgetter(1), groupby(things, key=itemgetter(0)))))
Or, since that many map calls gets rather ugly/un-Pythonic, and list comprehensions let us do nice things like unpacking to get named values, we can simplify to:
groups = [list(g) for k, g in groupby(things, key=itemgetter(0))]
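Note that the lists produced this way still contain the full (key, value) tuples. If what you actually want is ['bear', 'duck'] as described in the question, one possible sketch is to unpack each tuple in a nested comprehension. The name names_by_group is mine, chosen for illustration.

```python
from itertools import groupby
from operator import itemgetter

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
          ("vehicle", "speed boat"), ("vehicle", "school bus")]

# The inner comprehension consumes each group immediately and keeps only
# the second element of every (key, name) tuple.
names_by_group = [
    [name for _, name in group]
    for _, group in groupby(things, key=itemgetter(0))
]

print(names_by_group)
# [['bear', 'duck'], ['cactus'], ['speed boat', 'school bus']]
```

Because the inner comprehension runs to completion before the outer loop advances, the shared-iterator problem does not arise here.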
Answer 1 (score: 1)
You can use a list comprehension as follows:
from itertools import groupby
things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
("vehicle", "speed boat"), ("vehicle", "school bus")]
g = groupby(things, lambda x: x[0])
answer = [list(group[1]) for group in g]
print(answer)
Output:
[[('animal', 'bear'), ('animal', 'duck')],
[('plant', 'cactus')],
[('vehicle', 'speed boat'), ('vehicle', 'school bus')]]
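If you also need to know which group is which, a small variation keeps the keys by building a dict instead of a list. This is only a sketch: the name grouped is hypothetical, and it assumes items with the same key are already adjacent in things (otherwise a later run with the same key would overwrite an earlier one).

```python
from itertools import groupby

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
          ("vehicle", "speed boat"), ("vehicle", "school bus")]

# Assumption: equal keys are contiguous, so each key appears exactly once.
# Each group is turned into a list before groupby moves on to the next key.
grouped = {key: list(group) for key, group in groupby(things, lambda x: x[0])}

print(grouped["animal"])
# [('animal', 'bear'), ('animal', 'duck')]
```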
Answer 2 (score: 0)
Quoting the Python documentation on groupby:
itertools.groupby(iterable, key=None)
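Make an iterator that returns consecutive keys and groups from the iterable. The key is a function computing a key value for each element. If not specified or is None, key defaults to an identity function and returns the element unchanged. Generally, the iterable needs to already be sorted on the same key function.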
>>> from itertools import groupby
>>>
>>> things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
...           ("vehicle", "speed boat"), ("vehicle", "school bus")]
>>>
>>>
>>> for _, g in groupby(things, lambda x: x[0]):
...     print(list(g))
...
[('animal', 'bear'), ('animal', 'duck')]
[('plant', 'cactus')]
[('vehicle', 'speed boat'), ('vehicle', 'school bus')]
>>>
>>> from operator import itemgetter
>>> l = [list(g) for _, g in groupby(things, itemgetter(0))]
>>> l
[[('animal', 'bear'), ('animal', 'duck')], [('plant', 'cactus')], [('vehicle', 'speed boat'), ('vehicle', 'school bus')]]
>>> from collections import defaultdict
>>>
>>> d = defaultdict(list)
>>>
>>> for k, v in groupby(things, itemgetter(0)):
...     # Each item in v is a (key, name) tuple; keep only the name
...     for _, name in v:
...         d[k].append(name)
...
>>> d
defaultdict(<class 'list'>, {'animal': ['bear', 'duck'], 'plant': ['cactus'], 'vehicle': ['speed boat', 'school bus']})
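The last sentence of the quoted documentation (the input generally needs to be sorted on the same key) matters as soon as the data is not already grouped. The sketch below uses a shuffled copy of the same data; shuffled, run_keys and sorted_groups are illustrative names, not from the posts above.

```python
from itertools import groupby
from operator import itemgetter

# Same items as `things`, but with equal keys no longer adjacent.
shuffled = [("vehicle", "school bus"), ("animal", "bear"), ("plant", "cactus"),
            ("animal", "duck"), ("vehicle", "speed boat")]

# groupby only merges *consecutive* items with equal keys, so without
# sorting, "animal" and "vehicle" each show up as two separate runs.
run_keys = [key for key, _ in groupby(shuffled, key=itemgetter(0))]

# Sorting on the same key function first makes equal keys adjacent.
by_key = sorted(shuffled, key=itemgetter(0))
sorted_groups = [(key, [name for _, name in group])
                 for key, group in groupby(by_key, key=itemgetter(0))]

print(run_keys)
print(sorted_groups)
```

Since sorted() is stable, items within each group keep their original relative order.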