Iterating through a nested list and calculating the average of its elements

Date: 2018-10-23 01:20:00

Tags: python iteration nested-lists

Using Riot's API, I am developing an application that analyses data from a player's League of Legends match history.


I have a list, item_list, containing item names and their times of purchase (in seconds).

I am trying to convert this into a list of unique items, each with its name and its average time of purchase.

For this example, this is the list I start with:

item_list = [['Boots of Speed', 50],
             ['Health Potion', 60],
             ['Health Potion', 80],
             ['Dorans Blade', 120],
             ['Dorans Ring', 180],
             ['Dorans Blade', 200],
             ['Dorans Ring', 210]]

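My attempted solution was to create an empty dictionary, iterate over the list, and use each item name as a dictionary key with the average time as its value. The result I am after is: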

['Boots of Speed', 50]
['Health Potion', 70]
['Dorans Blade', 160]
['Dorans Ring', 195]

The problem is that I would be performing a calculation on dict[item_name] before it has been initialised.


At this point I am a bit stuck. Any pointers or help would be appreciated.

3 answers:

Answer 0: (score: 4)

You can use setdefault:

item_list = [['Boots of Speed', 50],
             ['Health Potion', 60],
             ['Health Potion', 80],
             ['Dorans Blade', 120],
             ['Dorans Ring', 180],
             ['Dorans Blade', 200],
             ['Dorans Ring', 210]]

result = {}
for item, count in item_list:
    result.setdefault(item, []).append(count)

print([[key, sum(value) / len(value) ] for key, value in result.items()])

Or, as an alternative, use defaultdict from the collections module:

from collections import defaultdict

item_list = [['Boots of Speed', 50],
             ['Health Potion', 60],
             ['Health Potion', 80],
             ['Dorans Blade', 120],
             ['Dorans Ring', 180],
             ['Dorans Blade', 200],
             ['Dorans Ring', 210]]

result = defaultdict(list)
for item, count in item_list:
    result[item].append(count)

print([[key, sum(value) / len(value) ] for key, value in result.items()])

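Output (dictionary order is only guaranteed from Python 3.7 onwards, so the entries may appear in a different order):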

[['Dorans Blade', 160.0], ['Boots of Speed', 50.0], ['Health Potion', 70.0], ['Dorans Ring', 195.0]]
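
If the grouping is needed in more than one place, the same idea can be wrapped in a small function. The following is only a minimal sketch and not part of the original answer: the function name average_purchase_times is made up for illustration, and statistics.mean is swapped in for sum(value) / len(value). It assumes Python 3.7 or later, where dictionaries keep insertion order, so the items come back in first-seen order.

```python
from collections import defaultdict
from statistics import mean


def average_purchase_times(item_list):
    """Return [[item_name, average_time], ...] in first-seen order.

    Hypothetical helper for illustration; relies on dicts preserving
    insertion order (guaranteed from Python 3.7).
    """
    times_by_item = defaultdict(list)
    for item_name, time_of_purchase in item_list:
        # Collect every purchase time under its item name.
        times_by_item[item_name].append(time_of_purchase)
    # Average each collected list once all times are known.
    return [[name, mean(times)] for name, times in times_by_item.items()]


item_list = [['Boots of Speed', 50],
             ['Health Potion', 60],
             ['Health Potion', 80],
             ['Dorans Blade', 120],
             ['Dorans Ring', 180],
             ['Dorans Blade', 200],
             ['Dorans Ring', 210]]

averages = average_purchase_times(item_list)
print(averages)
```

Collecting the raw times first and averaging at the end avoids touching a key before it exists, which was the problem described in the question.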

Answer 1: (score: 2)

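I would first fill a dictionary in which each item_name maps to a list of time_of_purchase values. Once that is done, I would iterate over the dictionary's (key, list) pairs and compute the average of each list.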

item_list = [['Boots of Speed', 50],
['Health Potion', 60],
['Health Potion', 80],
['Dorans Blade', 120],
['Dorans Ring', 180],
['Dorans Blade', 200],
['Dorans Ring', 210]]

# Fill the dictionary
d = {}
for item in item_list:
    item_name, time_of_purchase = item
    if item_name not in d:
        d[item_name] = []
    d[item_name].append(time_of_purchase)

# Now calculate and print the average
retlist = []
for item_name, list_of_times in d.items():
    new_entry = [
        item_name,
        sum(list_of_times) // len(list_of_times),
    ]
    retlist.append(new_entry)
print(retlist)

Daniel's solution achieves the same thing in a more Pythonic and efficient way.
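
One detail worth noting in the code above: // is floor division, so each average is rounded down to a whole second, whereas / in Python 3 gives a float. For the sample data both give the same values, but they differ whenever the sum is not evenly divisible by the count. A small illustration with made-up times (not taken from the question):

```python
times = [60, 80, 81]  # made-up purchase times, in seconds

floor_average = sum(times) // len(times)  # floor division: rounds down
true_average = sum(times) / len(times)    # true division in Python 3: a float
rounded_average = round(true_average)     # nearest whole second

print(floor_average, true_average, rounded_average)
```

Here the floor average is 73 while rounding the true average gives 74, so the choice of operator matters if fractions of a second are relevant.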

Answer 2: (score: 0)

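There are two problems with your approach: the one you identified, and a second one, namely that the average is calculated incorrectly if an item appears three times. One way to fix this is to sum the times while separately recording the number of occurrences, and then calculate the average as a second step.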

item_list = [['Boots of Speed', 50],
['Health Potion', 60],
['Health Potion', 80],
['Dorans Blade', 120],
['Dorans Ring', 180],
['Dorans Blade', 200],
['Dorans Blade', 200],
['Dorans Blade', 200],
['Dorans Ring', 210]]

item_dict = {}
for item in item_list:
    item_name = item[0]
    time_of_purchase = item[1]
    if (item_name in item_dict):
        # Add the duplicate item in
        item_dict[item_name] = item_dict[item_name][0] + time_of_purchase, item_dict[item_name][1] + 1
    else:
        # First time recording this item
        item_dict[item_name] = (time_of_purchase, 1)

for item_name in item_dict.keys():
    purchase_time = item_dict[item_name][0]
    purchase_count= item_dict[item_name][1]
    print("%-15s - %u" % (item_name, purchase_time/purchase_count))
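
To see why averaging pairwise goes wrong once an item appears more than twice, it can help to compare it with the sum-and-count approach on the four 'Dorans Blade' times from the list above. This is only an illustrative sketch: the naive loop is a guess at what a running average might look like, not the asker's actual code.

```python
times = [120, 200, 200, 200]  # the four 'Dorans Blade' purchase times

# Naive running average: repeatedly average the stored value with the new one.
naive = times[0]
for t in times[1:]:
    naive = (naive + t) / 2  # later values end up weighted too heavily

# Sum and count separately, then divide once at the end.
total, count = 0, 0
for t in times:
    total += t
    count += 1
correct = total / count

print(naive, correct)
```

The naive loop ends at 190, while the true mean of the four times is 180.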