Hi, I have a nested list named data:
data=[('A',2),('B',2),('B',4),('B',6),('B',8),('B',6),('B',4),('B',3),('C',10),('C',10),('C',10),('D',12),('E',12),('F',10),('F',8),('F',6)]
average=[]
I want the average of the values for each letter:
Expected output:
average=[('A',2),('B',5),('C',10),('D',12),('E',12),('F',8)]
Any suggestions? Thanks in advance!
Answer 0: (score: 3)
Here is an option using defaultdict:
from collections import defaultdict
avg = defaultdict(lambda :{'count': 0, 'sum': 0})
# calculate the sum and count for each key
for k, v in data:
    avg[k]['count'] += 1
    avg[k]['sum'] += v
# calculate the average
[(k, v['sum']/v['count']) for k, v in avg.items()]
#[('A', 2.0),
# ('D', 12.0),
# ('F', 8.0),
# ('E', 12.0),
# ('B', 4.714285714285714),
# ('C', 10.0)]
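The expected output in the question shows whole numbers (B is 5, not 4.714...), so the averages presumably need rounding. The following is a minimal sketch of that extra step, not part of the original answer: the name `totals` and the choice to sort by letter are assumptions made for illustration.

```python
from collections import defaultdict

data = [('A', 2), ('B', 2), ('B', 4), ('B', 6), ('B', 8), ('B', 6), ('B', 4),
        ('B', 3), ('C', 10), ('C', 10), ('C', 10), ('D', 12), ('E', 12),
        ('F', 10), ('F', 8), ('F', 6)]

# totals[letter] holds [running sum, count]; 'totals' is an illustrative name
totals = defaultdict(lambda: [0, 0])
for letter, value in data:
    totals[letter][0] += value
    totals[letter][1] += 1

# Sort by letter for a stable order, then round each mean to the nearest int
average = [(letter, round(s / n)) for letter, (s, n) in sorted(totals.items())]
print(average)
# [('A', 2), ('B', 5), ('C', 10), ('D', 12), ('E', 12), ('F', 8)]
```

Note that Python 3's `round` rounds exact halves to the nearest even integer (`round(2.5)` is 2), which may or may not be what you want for other data.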
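Answer 1: (score: 2)
Try using groupby. Note that itertools.groupby only groups consecutive items with the same key, so this works here because data is already ordered by letter: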
from itertools import groupby
data_ = [(n,[i[1] for i in g]) for n,g in groupby(data, key = lambda x:x[0])]
result = [(i,float(sum(j))/float(len(j))) for i,j in data_]
Result:
[('A', 2.0),
('B', 4.714285714285714),
('C', 10.0),
('D', 12.0),
('E', 12.0),
('F', 8.0)]
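If the input is not already ordered by letter, `itertools.groupby` would yield the same letter more than once. A small sketch of the usual workaround, sorting by the key first; the `shuffled` list below is made-up sample data, not the data from the question.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical unsorted input: 'B' appears in three separate runs
shuffled = [('B', 4), ('A', 2), ('B', 2), ('C', 10), ('B', 6)]

# Sort by letter so that equal keys become adjacent before grouping
ordered = sorted(shuffled, key=itemgetter(0))

result = []
for letter, group in groupby(ordered, key=itemgetter(0)):
    values = [v for _, v in group]
    result.append((letter, sum(values) / len(values)))

print(result)
# [('A', 2.0), ('B', 4.0), ('C', 10.0)]
```

Sorting costs O(n log n), so for large unsorted inputs the dictionary-based approach may be the simpler choice.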
Answer 2: (score: 0)
You might consider using a more suitable tool for this kind of data. For example, with pandas and its combination of groupby and mean, this task becomes simple:
import pandas as pd
df = pd.DataFrame(data, columns=['letter', 'number'])
print(df)
# letter number
# 0 A 2
# 1 B 2
# 2 B 4
# 3 B 6
# 4 B 8
# 5 B 6
# 6 B 4
# 7 B 3
# 8 C 10
# 9 C 10
# 10 C 10
# 11 D 12
# 12 E 12
# 13 F 10
# 14 F 8
# 15 F 6
print(df.groupby('letter').mean())
# number
# letter
# A 2.000000
# B 4.714286
# C 10.000000
# D 12.000000
# E 12.000000
# F 8.000000
print(df.groupby('letter').mean().round().astype(int))
# number
# letter
# A 2
# B 5
# C 10
# D 12
# E 12
# F 8
You can get a list of tuples as follows:
averages = df.groupby('letter').mean().round().astype(int)
result = list(averages.to_records())
print(result)
# [('A', 2), ('B', 5), ('C', 10), ('D', 12), ('E', 12), ('F', 8)]