I have the following list of dictionaries in Python.
[{"country": "IE", "values": ["Server1-17.6650", "Server3-78.6064", "Server2-3.7286"]}, {"country": "CA", "values": ["Server1-100.0000"]}, {"country": "DE", "values": ["Server2-100.0000"]}, {"country": "JP", "values": ["Server2-100.0000"]}, {"country": "IT", "values": ["Server1-100.0000"]}, {"country": "US", "values": ["Server1-6.3158", "Server3-15.7895", "Server2-77.8947", "Server1-5.5556", "Server3-2.7778", "Server2-91.6667", "Server1-12.6145", "Server3-86.8043", "Server2-0.5811"]}, {"country": "CZ", "values": ["Server1-100.0000"]}, {"country": None, "values": ["Server1-100.0000", "Server2-100.0000", "Server2-100.0000", "Server1-100.0000"]}, {"country": "AU", "values": ["Server2-100.0000"]}, {"country": "IL", "values": ["Server1-100.0000"]}, {"country": "BR", "values": ["Server2-100.0000"]}, {"country": "KP", "values": ["Server1-100.0000"]}, {"country": "SG", "values": ["Server1-79.2000", "Server2-20.8000"]}, {"country": "ES", "values": ["Server1-100.0000"]}]
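Now, for each 'values' list, if a server name is repeated in the list, I have to average the numbers that follow the '-' for that server. For the list above, the final output should be: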
[{"country": "IE", "values": ["Server1-17.6650", "Server3-78.6064", "Server2-3.7286"]}, {"country": "CA", "values": ["Server1-100.0000"]}, {"country": "DE", "values": ["Server2-100.0000"]}, {"country": "JP", "values": ["Server2-100.0000"]}, {"country": "IT", "values": ["Server1-100.0000"]}, {"country": "US", "values": ["Server1-8.1619", "Server3-35.1238", "Server2-56.7141"]}, {"country": "CZ", "values": ["Server1-100.0000"]}, {"country": None, "values": ["Server1-100.0000", "Server2-100.0000", "Server2-100.0000", "Server1-100.0000"]}, {"country": "AU", "values": ["Server2-100.0000"]}, {"country": "IL", "values": ["Server1-100.0000"]}, {"country": "BR", "values": ["Server2-100.0000"]}, {"country": "KP", "values": ["Server1-100.0000"]}, {"country": "SG", "values": ["Server1-79.2000", "Server2-20.8000"]}, {"country": "ES", "values": ["Server1-100.0000"]}]
I tried the following code in Python:
for key_dict in resp:
    for i, value in enumerate(key_dict['values']):
        for j, new_value in enumerate(key_dict['values']):
            if value[:value.index('-')] == new_value[:new_value.index('-')]:
                key_dict['values'][i] = value[:value.index('-')] + str(float(value[value.index('-'):]) + float(new_value[new_value.index('-'):]))
                del key_dict['values'][j]
But this does not produce the result I need. Can someone point out how to do this in Python?
Answer 0 (score: 5)
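This is one of those problems that is trivial with the right data structure and painful without it. If 'values' were a dict mapping each server name to a list of numbers, rather than one big list of strings, this would be easy.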
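As a rough illustration of that shape, here is a minimal sketch using the numbers from the US entry in the question. The names us_values and us_averages are hypothetical, chosen only for this example:

```python
# Assumed shape: each server name maps to a list of floats.
us_values = {
    "Server1": [6.3158, 5.5556, 12.6145],
    "Server3": [15.7895, 2.7778, 86.8043],
    "Server2": [77.8947, 91.6667, 0.5811],
}

# With that structure, averaging needs no string parsing at all.
us_averages = {name: sum(nums) / len(nums) for name, nums in us_values.items()}
```

Because dicts keep insertion order in Python 3.7+, the servers come out in the order they were first added.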
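If you can control how the values arrive in the first place, you should do that. If you can't, you probably want to convert them manually. Something like this: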
for key_dict in resp:
    new_values = {}
    for value in key_dict['values']:
        # Split 'Server1-17.6650' into the name and the number string.
        name, number = value.split('-', 1)
        new_values.setdefault(name, []).append(float(number))
    key_dict['values'] = new_values
Now, averaging them is trivial:
for key_dict in resp:
    averages = {}
    for name, numbers in key_dict['values'].items():
        averages[name] = sum(numbers) / len(numbers)
    key_dict['values'] = averages
If you really need to turn them back into strings at the end, you can:
for key_dict in resp:
    key_dict['values'] = ['{}-{}'.format(name, value)
                          for name, value in key_dict['values'].items()]
Of course, if you really want to, you can combine all of this into one pass:
for key_dict in resp:
    values = {}
    for value in key_dict['values']:
        name, number = value.split('-', 1)
        values.setdefault(name, []).append(float(number))
    values = ['{}-{}'.format(name, sum(numbers)/len(numbers))
              for name, numbers in values.items()]
    key_dict['values'] = values
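The combined loop can also be wrapped in a small helper. The sketch below is one possible packaging, not part of the answer itself: the function name average_values and the four-decimal format are assumptions, the latter chosen to resemble the strings in the question. Note that '{:.4f}' rounds, while the expected output in the question looks truncated (8.1619 rather than 8.1620), so the last digit can differ.

```python
from collections import defaultdict


def average_values(values):
    """Average the numbers of repeated server names in a list of 'name-number' strings."""
    grouped = defaultdict(list)
    for value in values:
        # Split only on the first '-', so the number part stays intact.
        name, number = value.split('-', 1)
        grouped[name].append(float(number))
    # Dicts keep insertion order (Python 3.7+), so servers appear in first-seen order.
    return ['{}-{:.4f}'.format(name, sum(nums) / len(nums))
            for name, nums in grouped.items()]


us = ["Server1-6.3158", "Server3-15.7895", "Server2-77.8947",
      "Server1-5.5556", "Server3-2.7778", "Server2-91.6667",
      "Server1-12.6145", "Server3-86.8043", "Server2-0.5811"]
us_result = average_values(us)
```

It would be applied with key_dict['values'] = average_values(key_dict['values']) inside the same loop over resp.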
Answer 1 (score: 2)
You can use groupby here:
import numpy as np
from itertools import groupby

def average_servers(server_list):
    post_split = [x.split('-') for x in server_list]
    averages = []
    # groupby only merges adjacent items, so sort by server name first.
    for server, data in groupby(sorted(post_split), lambda x: x[0]):
        cur_average = np.mean([float(x[1]) for x in list(data)])
        averages.append('{}-{}'.format(server, cur_average))
    return averages
Then apply the function to build a new list of strings to store under the 'values' key:
for entry in your_data_structure:
    entry['values'] = average_servers(entry['values'])
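If pulling in NumPy just for a mean feels heavy, the same groupby idea can be sketched with the standard library only. This is a variant for illustration, not the answer's code: the name average_servers_stdlib is made up here, and statistics.mean stands in for np.mean. The first two lines show why the sort matters: groupby only merges adjacent equal keys.

```python
from itertools import groupby
from statistics import mean

# Without sorting, a repeated name that is not adjacent forms a second group.
names = ["Server1", "Server3", "Server1"]
unsorted_keys = [key for key, _ in groupby(names)]
sorted_keys = [key for key, _ in groupby(sorted(names))]


def average_servers_stdlib(server_list):
    # Sort the [name, number] pairs so equal names sit next to each other.
    pairs = sorted(item.split('-', 1) for item in server_list)
    return ['{}-{}'.format(server, mean(float(num) for _, num in group))
            for server, group in groupby(pairs, key=lambda pair: pair[0])]


demo = average_servers_stdlib(["Server2-100.0000", "Server1-100.0000", "Server2-50.0000"])
```

As with the NumPy version, the result comes back sorted by server name rather than in the original order, and the numbers are not padded to four decimals.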