I have the following list:
list=[['BMW Z4', 'TEST', 18, '2016-09-26'],
['BMW Z4', 'TEST', 144, '2014-10-30'],
['BMW 335i', 'TEST', 144, '2013-09-26'],
['BMW 335i', 'TEST', 360, '2014-08-31'],
['BMW 335i', 'TEST', 360, '2017-08-31'],
['BMW 550xd', 'TEST', 18, '2016-10-30'],
['BMW 550xd', 'TEST', 36, '2014-10-30']]
I am trying to create:
list2=[['BMW Z4', 'TEST', 162, '2016-09-26','2014-10-30'],
['BMW 335i', 'TEST', 864, '2017-08-31','2013-09-26'],
['BMW 550xd', 'TEST', 54, '2016-10-30','2014-10-30']]
Do you have any suggestions on how to get a table like list2 using a Python function?
Answer 0 (score: 2)
You can use itertools.groupby():
from itertools import groupby
lst = [['BMW Z4', 'TEST', 18, '2016-09-26'],
['BMW Z4', 'TEST', 144, '2014-10-30'],
['BMW 335i', 'TEST', 144, '2013-09-26'],
['BMW 335i', 'TEST', 360, '2014-08-31'],
['BMW 335i', 'TEST', 360, '2017-08-31'],
['BMW 550xd', 'TEST', 18, '2016-10-30'],
['BMW 550xd', 'TEST', 36, '2014-10-30']]
lst2 = []
for k, g in groupby(lst, lambda x: x[0]):
    g = list(g)
    lst2.append([k, "TEST", sum(x[2] for x in g), max(x[3] for x in g),
                 min(x[3] for x in g)])
print(lst2)
Output:
[['BMW Z4', 'TEST', 162, '2016-09-26', '2014-10-30'],
['BMW 335i', 'TEST', 864, '2017-08-31', '2013-09-26'],
['BMW 550xd', 'TEST', 54, '2016-10-30', '2014-10-30']]
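One caveat about the approach above: itertools.groupby() only merges consecutive rows that share a key, so it works here because the rows are already grouped by model. A minimal sketch, assuming the input might arrive unordered, sorts by the key first. The function name `summarize` and the shuffled sample rows are made up for illustration and are not part of the original answer.

```python
from itertools import groupby
from operator import itemgetter


def summarize(rows):
    """Group rows by model name (hypothetical helper for illustration)."""
    key = itemgetter(0)
    out = []
    # groupby() only merges adjacent rows with equal keys, so sort by the key first.
    for model, group in groupby(sorted(rows, key=key), key=key):
        group = list(group)  # materialize: the group iterator can only be consumed once
        dates = [r[3] for r in group]
        out.append([model, group[0][1], sum(r[2] for r in group), max(dates), min(dates)])
    return out


# Sample rows deliberately interleaved so that equal keys are not adjacent.
rows = [
    ['BMW Z4', 'TEST', 18, '2016-09-26'],
    ['BMW 335i', 'TEST', 144, '2013-09-26'],
    ['BMW Z4', 'TEST', 144, '2014-10-30'],
    ['BMW 335i', 'TEST', 360, '2014-08-31'],
]
print(summarize(rows))
```

Note that sorting changes the order of the result: groups come out ordered by key rather than in the order they first appear in the input.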
Answer 1 (score: 2)
You can use Pandas to do this:
import pandas as pd
list1=[['BMW Z4', 'TEST', 18, '2016-09-26'],
['BMW Z4', 'TEST', 144, '2014-10-30'],
['BMW 335i', 'TEST', 144, '2013-09-26'],
['BMW 335i', 'TEST', 360, '2014-08-31'],
['BMW 335i', 'TEST', 360, '2017-08-31'],
['BMW 550xd', 'TEST', 18, '2016-10-30'],
['BMW 550xd', 'TEST', 36, '2014-10-30']]
result = pd.DataFrame(list1).groupby(0, as_index=False).agg({1:'first', 2:'sum', 3:['max', 'min']}).values
print(result)
Which will give you:
[['BMW 335i' 'TEST' 864 '2017-08-31' '2013-09-26']
['BMW 550xd' 'TEST' 54 '2016-10-30' '2014-10-30']
['BMW Z4' 'TEST' 162 '2016-09-26' '2014-10-30']]
(Note that you should not name a variable `list`, because that shadows the built-in type.)
Answer 2 (score: 1)
You can also use pandas:
import pandas as pd
import numpy as np
df = pd.DataFrame(l)  # l is the list of rows from the question
0 1 2 3
0 BMW Z4 TEST 18 2016-09-26
1 BMW Z4 TEST 144 2014-10-30
2 BMW 335i TEST 144 2013-09-26
3 BMW 335i TEST 360 2014-08-31
4 BMW 335i TEST 360 2017-08-31
5 BMW 550xd TEST 18 2016-10-30
6 BMW 550xd TEST 36 2014-10-30
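# String names select the built-in groupby aggregations; recent pandas versions warn when NumPy callables are passed instead
l2 = df.groupby(0).agg({1: 'first', 2: 'sum', 3: ['max', 'min']}).reset_index().values.tolist()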
l2
[['BMW 335i', 'TEST', 864, '2017-08-31', '2013-09-26'],
['BMW 550xd', 'TEST', 54, '2016-10-30', '2014-10-30'],
['BMW Z4', 'TEST', 162, '2016-09-26', '2014-10-30']]
Also, don't call your list `list`.
Answer 3 (score: 1)
You can use defaultdict:
from collections import defaultdict
data = [
['BMW Z4', 'TEST', 18, '2016-09-26'],
['BMW Z4', 'TEST', 144, '2014-10-30'],
['BMW 335i', 'TEST', 144, '2013-09-26'],
['BMW 335i', 'TEST', 360, '2014-08-31'],
['BMW 335i', 'TEST', 360, '2017-08-31'],
['BMW 550xd', 'TEST', 18, '2016-10-30'],
['BMW 550xd', 'TEST', 36, '2014-10-30'],
]
d = defaultdict(lambda: {'sum': 0, 'dates': set()})
for row in data:
    d[row[0]]['sum'] += row[2]
    d[row[0]]['dates'].add(row[3])
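result = [
    # Keep only the latest and earliest date; listing every distinct date
    # would yield three dates for 'BMW 335i' instead of two.
    [key, 'TEST', value['sum'], max(value['dates']), min(value['dates'])]
    for key, value in d.items()
]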
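The max/min comparisons in these answers work on strings only because the dates are written in ISO format (YYYY-MM-DD), which sorts chronologically. A sketch, assuming you would rather compare real dates, parses them with `datetime.date.fromisoformat` (Python 3.7+) and keeps a running sum plus the latest and earliest date in a plain dict. The name `aggregate` is hypothetical.

```python
from datetime import date


def aggregate(rows):
    """Single pass over the rows (hypothetical helper for illustration)."""
    groups = {}  # dicts preserve insertion order, so first-seen model order is kept
    for model, label, amount, day in rows:
        d = date.fromisoformat(day)
        if model not in groups:
            groups[model] = [model, label, amount, d, d]
        else:
            g = groups[model]
            g[2] += amount       # running sum
            g[3] = max(g[3], d)  # latest date so far
            g[4] = min(g[4], d)  # earliest date so far
    # Convert the dates back to ISO strings to match the desired output format.
    return [[m, l, s, hi.isoformat(), lo.isoformat()]
            for m, l, s, hi, lo in groups.values()]


data = [
    ['BMW Z4', 'TEST', 18, '2016-09-26'],
    ['BMW Z4', 'TEST', 144, '2014-10-30'],
    ['BMW 335i', 'TEST', 144, '2013-09-26'],
    ['BMW 335i', 'TEST', 360, '2014-08-31'],
    ['BMW 335i', 'TEST', 360, '2017-08-31'],
    ['BMW 550xd', 'TEST', 18, '2016-10-30'],
    ['BMW 550xd', 'TEST', 36, '2014-10-30'],
]
print(aggregate(data))
```

Unlike the groupby() approach, this does not need the rows to be sorted or adjacent, and it takes the second column from the first row of each group rather than hard-coding 'TEST'.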
By the way, using list as a variable name is not a good idea.
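To illustrate the naming advice that several answers give, here is a small sketch of what goes wrong once a name like `list` is rebound. The function `shadowing_demo` is a made-up name, and the shadowing is confined to a function so the built-in stays usable elsewhere.

```python
def shadowing_demo():
    """Rebind `list` locally and show that the built-in is no longer reachable."""
    list = [['BMW Z4', 'TEST', 18, '2016-09-26']]  # local name shadows the built-in
    try:
        list('abc')  # intended: the built-in constructor; actual: calls the list object
    except TypeError as exc:
        return str(exc)


message = shadowing_demo()
print(message)

# Outside the function the built-in is untouched and works as usual.
print(list('abc'))
```

Using a name such as `rows`, `data`, or `lst` avoids the problem entirely.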