Does anyone know how to use the key func argument of itertools.groupby to group rows of data into zero and non-zero values?
Simplified example:
from collections import namedtuple
from operator import attrgetter
from itertools import groupby
FakeRow = namedtuple('FakeRow', ['start_date_time', 'wear_sensor',
                                 'part_number', 'chip_count'])
data = [
    FakeRow(1,1,'999-045', 0),
    FakeRow(2,1,'999-045', 4),
    FakeRow(3,1,'999-045', 3),
    FakeRow(3,1,'999-047', 0),
    FakeRow(4,1,'999-045', 0),
    FakeRow(5,1,'999-047', 1),
]
# need to groupby start date time first
unique_keys = []
groups = []
data = sorted(data, key=attrgetter('start_date_time'))
# want to group by 'chip_count' but by zero and non-zero values
for k, g in groupby(data, key=my_key_func(*args)):
    groups.append(list(g))
    unique_keys.append(k)
def my_key_func(*args):
    '''Help itertools.groupby group by zeros, or group by anything non-zero'''
    pass
The desired output is:
groups == [
    [FakeRow(1,1,'999-045', 0)],
    [FakeRow(2,1,'999-045', 4), FakeRow(3,1,'999-045', 3)],
    [FakeRow(3,1,'999-047', 0), FakeRow(4,1,'999-045', 0)],
    [FakeRow(5,1,'999-047', 1)]
]
Thanks.
Answer 0 (score: 1)
It should be as simple as looking at the boolean value of the fake row's chip_count:
def my_key_func(fakerow):
    return bool(fakerow.chip_count)
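In this case, your unique_keys will contain only True or False (one entry per consecutive run of rows), which may not be what you want. You may want to make unique_keys a set and update it with each row's fakerow.chip_count instead: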
unique_keys = set()
for k, g in groupby(data, key=my_key_func):
    group = list(g)
    groups.append(group)
    unique_keys.update(fk.chip_count for fk in group)
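For reference, here is a self-contained sketch that puts the pieces together on the sample data from the question. The names group_keys and chip_counts are only illustrative and not from the original code. It keeps the per-group boolean keys in a list and collects the distinct chip_count values in a set, so either can be used depending on what is needed.

```python
from collections import namedtuple
from itertools import groupby
from operator import attrgetter

FakeRow = namedtuple('FakeRow', ['start_date_time', 'wear_sensor',
                                 'part_number', 'chip_count'])

data = [
    FakeRow(1, 1, '999-045', 0),
    FakeRow(2, 1, '999-045', 4),
    FakeRow(3, 1, '999-045', 3),
    FakeRow(3, 1, '999-047', 0),
    FakeRow(4, 1, '999-045', 0),
    FakeRow(5, 1, '999-047', 1),
]


def my_key_func(fakerow):
    """Return False for a zero chip_count and True for any non-zero value."""
    return bool(fakerow.chip_count)


# sorted() is stable, so rows sharing a start_date_time keep their input order.
data = sorted(data, key=attrgetter('start_date_time'))

groups = []          # one list of rows per consecutive run
group_keys = []      # illustrative name: the boolean key of each run
chip_counts = set()  # illustrative name: distinct chip_count values seen

# groupby starts a new group each time the key changes between adjacent rows.
for k, g in groupby(data, key=my_key_func):
    group = list(g)
    groups.append(group)
    group_keys.append(k)
    chip_counts.update(row.chip_count for row in group)

print(group_keys)
print([[row.chip_count for row in group] for group in groups])
print(sorted(chip_counts))
```

Because groupby only merges adjacent rows with equal keys, the zero and non-zero runs alternate in the order they appear after sorting, which matches the grouping asked for in the question.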