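How to perform this aggregation in Python

Time: 2015-05-06 10:09:57

Tags: python dictionary data-structures aggregation

I have the following list of dictionaries, each containing a country and the corresponding server values.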

[
    {'country': 'KR', 'values': ['Server1']},
    {'country': 'IE', 'values': ['Server1', 'Server3', 'Server2']},
    {'country': 'IE', 'values': ['Server1', 'Server3']},
    {'country': 'DE', 'values': ['Server1']},
    {'country': 'DE', 'values': ['Server2']},
]

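Now I need to calculate the percentage of each server for a given country. For example, for IE the two lists contain 5 entries in total, so Server1 gets (2/5)*100 because it appears twice for IE, and likewise for the other servers. The percentage of Server1 is also stored in the dict under the key percent. So, for the structure above, the output becomes: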

[
    {"country": "KR", "percent": "100.0000", "values": ["Server1-100.0000"]},
    {"country": "IE", "percent": "40.000", "values": ["Server1-40.0", "Server3-40.0", "Server2-20.0"]},
    {"country" : "DE", "percent" : "50.0", "values" : ["Server1-50.0", "Server2-50.0"]},
]

I tried the following code.

for i in range(len(response) - 1):
   for j in range((i+1), len(response) - 1):
     if response[i]['country'] == response[j]['country']:
       print response[i]['country'], response[j]['country']
       total = len(response[i]['values']) +  len(response[j]['values'])
       print total
       for item in response[i]['values']:
         for ktem in response[j]['values']:
           if item == ktem:
              if item == 'Server1':
                response[i]['percent'] =  200/total
              else:
                response[i][percent] = 0
              del response[j]

I'm stuck on getting the percentage part right. Any help?

2 Answers:

Answer 0 (score: 1)

Say you have

orig = [
    {'country': 'KR', 'values': ['Server1']},
    {'country': 'IE', 'values': ['Server1', 'Server3', 'Server2']},
    {'country': 'IE', 'values': ['Server1', 'Server3']},
    {'country': 'DE', 'values': ['Server1']},
    {'country': 'DE', 'values': ['Server2']},
]

You can create a new dictionary that records which servers appear in which countries, along with their counts:

newDict = {}
for c in orig:
    if c['country'] not in newDict:
        newDict[c['country']] = dict()
    for s in c['values']:
        if s in newDict[c['country']]:
            newDict[c['country']][s] = newDict[c['country']][s] + 1
        else:
            newDict[c['country']][s] = 1

which will take the following form:

{'KR': {'Server1': 1}, 
 'DE': {'Server1': 1, 'Server2': 1}, 
 'IE': {'Server1': 2, 'Server2': 1, 'Server3': 2}}
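The same counting step could also be sketched with the standard library's `collections` module, assuming Python 3. The helper name `count_servers` is hypothetical and not part of the answer above.

```python
from collections import Counter, defaultdict

def count_servers(entries):
    """Return {country: Counter({server: occurrences})} for a list of entries."""
    counts = defaultdict(Counter)
    for entry in entries:
        # Counter.update adds one for every occurrence in the iterable.
        counts[entry['country']].update(entry['values'])
    return counts

orig = [
    {'country': 'KR', 'values': ['Server1']},
    {'country': 'IE', 'values': ['Server1', 'Server3', 'Server2']},
    {'country': 'IE', 'values': ['Server1', 'Server3']},
    {'country': 'DE', 'values': ['Server1']},
    {'country': 'DE', 'values': ['Server2']},
]

counts = count_servers(orig)
print({country: dict(servers) for country, servers in counts.items()})
```

A `Counter` returns 0 for a server it has never seen, so later lookups such as `counts['KR']['Server2']` do not raise a `KeyError`.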

Then you can compute the percentages:

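output = []
for country in newDict:
    # Total number of server entries recorded for this country.
    total = 0
    for server in newDict[country]:
        total = total + newDict[country][server]
    # "percent" is the share of Server1; .get avoids a KeyError if a country has no Server1.
    output.append({"country": country, "percent": (100.0 * newDict[country].get('Server1', 0)) / total})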

which will produce

[{'country': 'KR', 'percent': 100.0}, 
 {'country': 'DE', 'percent': 50.0}, 
 {'country': 'IE', 'percent': 40.0}]

I leave it as an exercise for the reader to optimize this and add the other fields you want.
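As one possible take on that exercise, the sketch below builds the full records, including the values field. It rests on several assumptions: Python 3.7+ (where dicts keep insertion order), percent meaning the Server1 share as in the question's expected output, and one decimal place for formatting. The function name `aggregate` and its `reference` parameter are hypothetical.

```python
from collections import Counter

def aggregate(entries, reference='Server1'):
    """Pool server lists per country and report each server's share in percent."""
    counts = {}
    for entry in entries:
        counts.setdefault(entry['country'], Counter()).update(entry['values'])

    output = []
    for country, servers in counts.items():
        total = sum(servers.values())
        if total == 0:
            continue  # a country with no servers has no meaningful percentage
        output.append({
            'country': country,
            # Counter returns 0 when the reference server is absent.
            'percent': '{:.1f}'.format(100.0 * servers[reference] / total),
            'values': ['{}-{:.1f}'.format(name, 100.0 * n / total)
                       for name, n in servers.items()],
        })
    return output

orig = [
    {'country': 'KR', 'values': ['Server1']},
    {'country': 'IE', 'values': ['Server1', 'Server3', 'Server2']},
    {'country': 'IE', 'values': ['Server1', 'Server3']},
    {'country': 'DE', 'values': ['Server1']},
    {'country': 'DE', 'values': ['Server2']},
]

result = aggregate(orig)
for record in result:
    print(record)
```

The percentages are emitted as strings here only because the question's expected output uses strings; returning floats and formatting at display time would work equally well.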

Answer 1 (score: 1)

I have a more compact approach.

I think it is more readable and easier to understand. You can refer to the following:

This is your variable, which I declare as response:

response = [
    {'country': 'KR', 'values': ['Server1']},
    {'country': 'IE', 'values': ['Server1', 'Server3', 'Server2']},
    {'country': 'IE', 'values': ['Server1', 'Server3']},
    {'country': 'DE', 'values': ['Server1']},
    {'country': 'DE', 'values': ['Server2']},
]

Let's merge the values:

new_res = {}
for e in response:
    if e['country'] not in new_res:
        # Copy the list so that extend() below does not modify response.
        new_res[e['country']] = list(e['values'])
    else:
        new_res[e['country']].extend(e['values'])

If you want to see its contents, you can print new_res. It looks like this:

{
    'KR': ['Server1'],
    'DE': ['Server1', 'Server2'],
    'IE': ['Server1', 'Server3', 'Server2', 'Server1', 'Server3']
}

Use the collections module to count the elements:

from collections import Counter

new_list = []
for country, values in new_res.items():
    # elements are stored as dictionary keys and their counts are stored as dictionary values
    merge_values = Counter(values)

    # calculate percentage (100.0 forces float division on Python 2 as well)
    new_values = []
    total = sum(merge_values.values())
    for server_name, num in merge_values.items():
        # ex: Server1-40.0
        new_values.append("{0}-{1:.1f}".format(server_name, num * 100.0 / total))
    percent = merge_values["Server1"] * 100.0 / total
    new_list.append({"country": country, "percent": percent, "values": new_values})

Once the calculation is done, you can print new_list.

That gives you the answer you want.