I have a list of dictionaries, sorted by the key date:
from datetime import datetime

d = [{'date': datetime.strptime('2016-01-01 07:00', "%Y-%m-%d %H:%M"), 'val': 1},
     {'date': datetime.strptime('2016-01-01 23:00', "%Y-%m-%d %H:%M"), 'val': 3},
     {'date': datetime.strptime('2016-01-02 07:00', "%Y-%m-%d %H:%M"), 'val': 5},
     {'date': datetime.strptime('2016-01-02 22:13', "%Y-%m-%d %H:%M"), 'val': 7},
     {'date': datetime.strptime('2016-01-02 23:00', "%Y-%m-%d %H:%M"), 'val': 9},
     {'date': datetime.strptime('2016-01-03 00:10', "%Y-%m-%d %H:%M"), 'val': 17},
     {'date': datetime.strptime('2016-01-03 09:12', "%Y-%m-%d %H:%M"), 'val': 25},
     {'date': datetime.strptime('2016-01-03 21:52', "%Y-%m-%d %H:%M"), 'val': 37}]
I want to get the last (most recent) item for each day, so in this case it would be:
{'date': datetime.strptime('2016-01-01 23:00', "%Y-%m-%d %H:%M"), 'val': 3},
{'date': datetime.strptime('2016-01-02 23:00', "%Y-%m-%d %H:%M"), 'val': 9},
{'date': datetime.strptime('2016-01-03 21:52', "%Y-%m-%d %H:%M"), 'val': 37},
I have the following code, which solves the problem:
previous_item = None
wanted_data = []
for index, entry in enumerate(d):
    if not previous_item:
        previous_item = entry
        continue
    if entry['date'].date() != previous_item['date'].date():
        wanted_data.append(previous_item)
    previous_item = entry
    # Add as well the last item
    if index + 1 == len(d):
        wanted_data.append(entry)
But I'm sure there is a better and faster way to do this... Besides, this is really ugly.
Is there a more Pythonic way to achieve this?
Thanks!
Answer 0 (score: 3)
Assuming the data is already sorted by 'date' (which seems to be your case), you can use itertools.groupby to group by date() and then take the last item from each group.
>>> import itertools
>>> d = sorted(d, key=lambda x: x["date"]) # only if not already sorted
>>> groups = itertools.groupby(d, lambda x: x["date"].date())
>>> wanted_data = [list(grp)[-1] for key, grp in groups]
>>> wanted_data
[{'date': datetime.datetime(2016, 1, 1, 23, 0), 'val': 3},
{'date': datetime.datetime(2016, 1, 2, 23, 0), 'val': 9},
{'date': datetime.datetime(2016, 1, 3, 21, 52), 'val': 37}]
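Note that this expands each group into a list. If that is too expensive because there are very many entries per date, you can write a function that gets the last entry from an iterator instead, e.g. using reduce (or functools.reduce in Python 3):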
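>>> import functools
>>> last = lambda x: functools.reduce(lambda x, y: y, x)
>>> groups = itertools.groupby(d, lambda x: x["date"].date())  # rebuild: a groupby iterator can only be consumed once
>>> wanted_data = [last(grp) for key, grp in groups]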
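If you would rather have this as a reusable function than a REPL session, here is a minimal sketch of the same groupby plus reduce idea. The name last_entry_per_day and the shortened sample list are made up for illustration, and the function assumes its input is already sorted by 'date'.

```python
import functools
import itertools
from datetime import datetime


def last_entry_per_day(entries):
    """Return the latest entry for each calendar day.

    Hypothetical helper: assumes `entries` is already sorted by 'date'.
    """
    # Walk an iterator and keep only its final element, without building a list.
    last = lambda it: functools.reduce(lambda _, item: item, it)
    # Consecutive entries sharing the same calendar date form one group.
    groups = itertools.groupby(entries, key=lambda e: e["date"].date())
    return [last(grp) for _, grp in groups]


fmt = "%Y-%m-%d %H:%M"
# Shortened sample data in the same shape as the question's list.
sample = [
    {"date": datetime.strptime("2016-01-01 07:00", fmt), "val": 1},
    {"date": datetime.strptime("2016-01-01 23:00", fmt), "val": 3},
    {"date": datetime.strptime("2016-01-02 07:00", fmt), "val": 5},
    {"date": datetime.strptime("2016-01-02 23:00", fmt), "val": 9},
]

result = last_entry_per_day(sample)
print([e["val"] for e in result])  # [3, 9]
```

Because groupby never yields an empty group, reduce can be called without an initial value here, and an empty input list simply gives an empty result.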