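I have a list of tuples, each containing a company name and a date. A company can appear with more than one date:
[('Company A', datetime.date(1980,1,30)),
('Company A', datetime.date(1990,1,30)),
('Company B', datetime.date(1990,1,30)),
('Company B', datetime.date(2000,1,30))]
What I want is a list that keeps only the most recent date available for each company, i.e. this result:
[('Company A', datetime.date(1990,1,30)),
('Company B', datetime.date(2000,1,30))]
Any ideas?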
Answer 0 (score: 3)
How about using groupby from itertools, followed by max:
import datetime
import itertools

x = [('Company A', datetime.date(1980,1,30)),
     ('Company A', datetime.date(1990,1,30)),
     ('Company B', datetime.date(1990,1,30)),
     ('Company B', datetime.date(2000,1,30))]

out = []
# groupby only groups adjacent items, so sort by company name first
for k, g in itertools.groupby(sorted(x, key=lambda y: y[0]), lambda y: y[0]):
    # within each company's group, keep the tuple with the latest date
    out.append(max(g, key=lambda y: y[1]))

print(out)
# [('Company A', datetime.date(1990, 1, 30)),
#  ('Company B', datetime.date(2000, 1, 30))]
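The sort before groupby matters because groupby only merges adjacent items that share a key. As a minimal sketch of the same approach wrapped in a function (the name latest_per_company and the interleaved sample data are assumptions for illustration, not part of the answer):

```python
import datetime
import itertools
from operator import itemgetter

def latest_per_company(records):
    """Return one (company, date) tuple per company, keeping the latest date."""
    # groupby only merges adjacent items with equal keys, so sort by company first.
    ordered = sorted(records, key=itemgetter(0))
    return [max(group, key=itemgetter(1))
            for _, group in itertools.groupby(ordered, key=itemgetter(0))]

# Hypothetical input where the companies are interleaved rather than pre-sorted.
records = [
    ('Company B', datetime.date(1990, 1, 30)),
    ('Company A', datetime.date(1980, 1, 30)),
    ('Company B', datetime.date(2000, 1, 30)),
    ('Company A', datetime.date(1990, 1, 30)),
]
result = latest_per_company(records)
print(result)
```

Without the sorted() call, this input would yield four single-item groups instead of two.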
Answer 1 (score: 2)
You could also use a dictionary...
data = [('Company A', '1980,1,30'),
('Company A', '1990,1,30'),
('Company B', '1990,1,30'),
('Company B', '2000,1,30')]
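# seed the dict with one date per company (the last one seen)
datadict = {a: b for a, b in data}
for a, b in data:
    # note: these dates are strings, so max() compares them lexicographically
    datadict[a] = max(b, datadict[a])
print(datadict)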
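The dictionary answer above stores the dates as strings, so max() orders them character by character. That happens to work for the sample data but not in general. A minimal single-pass sketch using datetime.date values instead (the function name latest_by_company and the sample records are assumptions for illustration):

```python
import datetime

def latest_by_company(records):
    """Map each company name to its most recent date in a single pass."""
    latest = {}
    for company, date in records:
        # Keep the date if the company is new or this date is more recent.
        if company not in latest or date > latest[company]:
            latest[company] = date
    return latest

# Hypothetical data chosen so that string comparison would pick the wrong date:
# '1990,2,28' > '1990,10,30' as strings, but October is later than February.
records = [
    ('Company A', datetime.date(1990, 2, 28)),
    ('Company A', datetime.date(1990, 10, 30)),
    ('Company B', datetime.date(2000, 1, 30)),
]
latest = latest_by_company(records)
print(sorted(latest.items()))
```

Comparing date objects avoids the ordering problem and also removes the need to pre-seed the dictionary.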
Answer 2 (score: 1)
Here is an example using reduce():
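import datetime
from functools import reduce  # reduce is not a builtin in Python 3

company_dates = [
    ('Company A', datetime.date(1980,1,30)),
    ('Company A', datetime.date(1990,1,30)),
    ('Company B', datetime.date(1990,1,30)),
    ('Company B', datetime.date(2000,1,30)),
]

def reducer(acc, company_date):
    # acc maps company name -> latest date seen so far
    try:
        acc[company_date[0]] = max(acc[company_date[0]], company_date[1])
    except KeyError:
        # first time this company appears
        acc[company_date[0]] = company_date[1]
    return acc

# named "latest" rather than "sorted" to avoid shadowing the builtin
latest = reduce(reducer, company_dates, {})
print(list(latest.items()))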
Here is another alternative solution that uses different functions:
import datetime
import operator
company_dates = [
('Company A', datetime.date(1980,1,30)),
('Company A', datetime.date(1990,1,30)),
('Company B', datetime.date(1990,1,30)),
('Company B', datetime.date(2000,1,30)),
]
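# sort descending by (company, date) so each company's latest date comes first
ordered = sorted(company_dates, key=operator.itemgetter(0, 1), reverse=True)
unique = set(company_date[0] for company_date in ordered)
# take the first (latest) entry for each company; set iteration order is arbitrary
top = [next(c for c in ordered if c[0] == company) for company in unique]
print(top)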
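The sort-then-pick idea in the solution above can probably be shortened. A minimal sketch, assuming Python 3.7+ (where dicts preserve insertion order) and that every item is a plain (company, date) tuple: sorting in ascending order puts each company's latest date last, and dict() keeps the last value it sees for a repeated key. This is a different technique from the one in the answer, shown only for comparison.

```python
import datetime

company_dates = [
    ('Company A', datetime.date(1980, 1, 30)),
    ('Company A', datetime.date(1990, 1, 30)),
    ('Company B', datetime.date(1990, 1, 30)),
    ('Company B', datetime.date(2000, 1, 30)),
]

# Tuples sort by company first, then by date, so the latest date per company comes last.
# dict() overwrites earlier values for a repeated key, leaving only the latest date.
latest = dict(sorted(company_dates))
top = list(latest.items())
print(top)
```

This avoids rescanning the sorted list once per company, at the cost of relying on dict overwrite semantics.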