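I'm looking for an elegant way to take a list of lists and find the maximum over subsets of its elements. A small example explains it better. Given the following input: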
data = [['1','AAA','somestuff','1/5/2018'],
['1','AAA','differentstuff','1/5/2018'],
['1','AAA','evendifferent','1/10/2018'],
['2','BBB','foo','1/12/2018'],
['2','BBB','bar','1/20/2018']]
I want to return the following list of lists:
[['1','AAA','evendifferent','1/10/2018'],
['2','BBB','bar','1/20/2018']]
The output is grouped by index 1 of the inner lists, and the maximum is based on the date (the last item in each inner list).
Answer 0 (score: 1)
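You need to convert the date strings to datetime objects. Otherwise '1/5/2018' would compare as "greater" than '1/10/2018', because '5' > '1' when the values are compared as strings.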
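A quick check illustrates the difference. This is only a sketch; the format string "%m/%d/%Y" assumes the dates are month/day/year, which the question does not state explicitly, and the variable names are made up for illustration.

```python
from datetime import datetime

# Assumption: the dates are written as month/day/year.
DATE_FORMAT = "%m/%d/%Y"

# Compared as plain strings, the character '5' sorts after '1'.
as_strings = '1/5/2018' > '1/10/2018'

# Compared as parsed dates, January 5 comes before January 10.
as_dates = (datetime.strptime('1/5/2018', DATE_FORMAT)
            > datetime.strptime('1/10/2018', DATE_FORMAT))

print(as_strings)  # True
print(as_dates)    # False
```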
You can do it like this:
data = [['1','AAA','somestuff','1/5/2018'],
        ['1','AAA','differentstuff','1/5/2018'],
        ['1','AAA','evendifferent','1/10/2018'],
        ['2','BBB','foo','1/12/2018'],
        ['2','BBB','bar','1/20/2018']]
# group rows by the value at index 1 (AAA, BBB, etc.) into lists
from collections import defaultdict
import datetime

dd = defaultdict(list)
for d in data:
    dd[d[1]].append(d)

# iterate over the groups and print the row with the latest date in each
for k in dd:
    # compare rows by their last item, parsed as a month/day/year date
    print(max(dd[k], key=lambda x: datetime.datetime.strptime(x[-1], "%m/%d/%Y")))
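The question asks for a list of lists rather than printed rows, so the same idea could be wrapped in a function that collects the results. The following is a minimal sketch, not the answer's original code: the name `latest_per_group` and its parameters are hypothetical, and the "%m/%d/%Y" format again assumes month/day/year dates.

```python
from datetime import datetime


def latest_per_group(rows, key_index=1, date_index=-1, date_format="%m/%d/%Y"):
    """Return one row per distinct rows[i][key_index]: the one with the latest date.

    Hypothetical helper for illustration; date_format assumes month/day/year.
    """
    best = {}  # group key -> (parsed date, row); dicts keep insertion order (Python 3.7+)
    for row in rows:
        when = datetime.strptime(row[date_index], date_format)
        key = row[key_index]
        # Keep the first row seen on ties, the same way max() does.
        if key not in best or when > best[key][0]:
            best[key] = (when, row)
    return [row for _, row in best.values()]


data = [['1', 'AAA', 'somestuff', '1/5/2018'],
        ['1', 'AAA', 'differentstuff', '1/5/2018'],
        ['1', 'AAA', 'evendifferent', '1/10/2018'],
        ['2', 'BBB', 'foo', '1/12/2018'],
        ['2', 'BBB', 'bar', '1/20/2018']]

result = latest_per_group(data)
print(result)
```

Groups come back in the order their keys first appear in the input, and each date string is parsed only once.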
Documentation: