Merging Python dictionaries on common keys

Date: 2014-04-30 03:33:16

Tags: python python-2.7 dictionary

Suppose I have the following dictionaries:

{name: "john", place: "nyc", owns: "gold", quantity: 30}
{name: "john", place: "nyc", owns: "silver", quantity: 20}
{name: "jane", place: "nyc", owns: "platinum", quantity: 5}
{name: "john", place: "chicago", owns: "brass", quantity: 60}
{name: "john", place: "chicago", owns: "silver", quantity: 40}

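I have hundreds of these small dictionaries. I need to merge them on a subset of their common keys, in this example (name, place), and create new dictionaries. In the end, the output should look like this: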

{name: "john", place: "nyc", gold: 30, silver: 20}
{name: "jane", place: "nyc", platinum: 5}
{name: "john", place: "chicago", brass: 60, silver: 40}

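Is there an efficient way to do this? All I can think of is brute force: keep track of every possible name-place combination in some list, then traverse the whole thing again for each combination and merge the matching dictionaries into a new one. Thanks!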

5 answers:

Answer 0: (score: 7)

First, to get the output you asked for:

data = [{'name': "john", 'place': "nyc", 'owns': "gold", 'quantity': 30},
{'name': "john", 'place': "nyc", 'owns': "silver", 'quantity': 20},
{'name': "jane", 'place': "nyc", 'owns': "platinum", 'quantity': 5},
{'name': "john", 'place': "chicago", 'owns': "brass", 'quantity': 60},
{'name': "john", 'place': "chicago", 'owns': "silver", 'quantity': 40}]

from collections import defaultdict

accumulator = defaultdict(list)

for p in data:
    accumulator[p['name'],p['place']].append((p['owns'],p['quantity']))

from itertools import chain

[dict(chain([('name',name), ('place',place)], rest)) for (name,place),rest in accumulator.iteritems()]
Out[13]: 
[{'name': 'jane', 'place': 'nyc', 'platinum': 5},
 {'brass': 60, 'name': 'john', 'place': 'chicago', 'silver': 40},
 {'gold': 30, 'name': 'john', 'place': 'nyc', 'silver': 20}]

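Now, I will point out that this collection-of-dicts data structure you've asked for is pretty awkward. Dicts are great for lookup, but they work best when you only need one of them for a whole group of objects. If you have to search linearly through a bunch of dicts to find the one you want, you've immediately lost the whole benefit a dict provides. That leaves a couple of options: go one level deeper and nest dicts in a dict, or use something else entirely.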
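
As a minimal sketch of the "nest dicts in a dict" option: the outer dict is keyed by the (name, place) tuple, so finding one person's holdings is a single lookup rather than a scan. The names `records` and `holdings_by_person` are illustrative and do not come from the question.

```python
from collections import defaultdict

records = [
    {"name": "john", "place": "nyc", "owns": "gold", "quantity": 30},
    {"name": "john", "place": "nyc", "owns": "silver", "quantity": 20},
    {"name": "jane", "place": "nyc", "owns": "platinum", "quantity": 5},
    {"name": "john", "place": "chicago", "owns": "brass", "quantity": 60},
    {"name": "john", "place": "chicago", "owns": "silver", "quantity": 40},
]

# Outer key: (name, place). Inner dict: material owned -> quantity.
holdings_by_person = defaultdict(dict)
for record in records:
    key = (record["name"], record["place"])
    holdings_by_person[key][record["owns"]] = record["quantity"]

# Direct lookup by key, with no linear search through a list of dicts.
print(holdings_by_person["john", "nyc"])
```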

May I suggest a list of meaningful objects, each one representing one of these people? Either create your own class, or use a namedtuple:

from collections import namedtuple

Person = namedtuple('Person','name place holdings')

[Person(name, place, dict(rest)) for (name,place), rest in accumulator.iteritems()]
Out[17]: 
[Person(name='jane', place='nyc', holdings={'platinum': 5}),
 Person(name='john', place='chicago', holdings={'brass': 60, 'silver': 40}),
 Person(name='john', place='nyc', holdings={'silver': 20, 'gold': 30})]
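
For readers on Python 3, where `dict.iteritems()` no longer exists, the same idea can be sketched with `.items()`. This is an adaptation rather than the answer's original code; it rebuilds `accumulator` so the block runs on its own, and the name `people` is introduced only for illustration.

```python
from collections import defaultdict, namedtuple

data = [
    {"name": "john", "place": "nyc", "owns": "gold", "quantity": 30},
    {"name": "john", "place": "nyc", "owns": "silver", "quantity": 20},
    {"name": "jane", "place": "nyc", "owns": "platinum", "quantity": 5},
    {"name": "john", "place": "chicago", "owns": "brass", "quantity": 60},
    {"name": "john", "place": "chicago", "owns": "silver", "quantity": 40},
]

Person = namedtuple("Person", "name place holdings")

# Group (owns, quantity) pairs under each (name, place) key.
accumulator = defaultdict(list)
for p in data:
    accumulator[p["name"], p["place"]].append((p["owns"], p["quantity"]))

people = [Person(name, place, dict(rest))
          for (name, place), rest in accumulator.items()]

# Fields are read by attribute instead of by string key.
for person in people:
    print(person.name, person.place, person.holdings.get("silver", 0))
```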

Answer 1: (score: 1)

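My personal strategy is outlined below. Define a key generator that takes a dict instance, then group the dicts by the generated key into a separate dict. Once you have iterated over all the elements and updated each group by its key, just return the .values() of the grouped dict: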

dicts = [
    {"name": "john", "place": "nyc", "owns": "gold", "quantity": 30},
    {"name": "john", "place": "nyc", "owns": "silver", "quantity": 20},
    {"name": "jane", "place": "nyc", "owns": "platinum", "quantity": 5},
    {"name": "john", "place": "chicago", "owns": "brass", "quantity": 60},
    {"name": "john", "place": "chicago", "owns": "silver", "quantity": 40}
]

def get_key(instance):
    return "%s-%s" % (instance.get("name"), instance.get("place"), )

grouped = {}

for dict_ in dicts:
    grouped[get_key(dict_)] = grouped.get(get_key(dict_), {})
    grouped[get_key(dict_)].update(dict_)

print grouped.values()
# [
#   {'owns': 'platinum', 'place': 'nyc', 'name': 'jane', 'quantity': 5},
#   {'name': 'john', 'place': 'nyc', 'owns': 'silver', 'quantity': 20}, 
#   {'name': 'john', 'place': 'chicago', 'owns': 'silver', 'quantity': 40}
# ]
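
Note that in the output above each group keeps only its last `owns`/`quantity` pair, because `update()` overwrites those two keys on every pass. One possible adjustment, sketched here under the assumption that the question's output format is the goal (it is not the answer's own code), stores each `owns` value as a key with its `quantity` as the value. The helper name `merge_by_key` is made up for illustration.

```python
dicts = [
    {"name": "john", "place": "nyc", "owns": "gold", "quantity": 30},
    {"name": "john", "place": "nyc", "owns": "silver", "quantity": 20},
    {"name": "jane", "place": "nyc", "owns": "platinum", "quantity": 5},
    {"name": "john", "place": "chicago", "owns": "brass", "quantity": 60},
    {"name": "john", "place": "chicago", "owns": "silver", "quantity": 40},
]

def get_key(instance):
    # Same string key as in the answer above.
    return "%s-%s" % (instance.get("name"), instance.get("place"))

def merge_by_key(items):
    # Hypothetical helper: one merged dict per key, material -> quantity.
    grouped = {}
    for item in items:
        merged = grouped.setdefault(
            get_key(item), {"name": item["name"], "place": item["place"]}
        )
        merged[item["owns"]] = item["quantity"]
    return list(grouped.values())

print(merge_by_key(dicts))
```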

Answer 2: (score: 0)

Maybe a crazy idea, but how about a dict of dicts of dicts? It would work like a 2D array, with the names and places as the row and column indices.

my_dicts = [
    {"name": "john", "place": "nyc", "owns": "gold", "quantity": 30},
    {"name": "john", "place": "nyc", "owns": "silver", "quantity": 20},
    {"name": "jane", "place": "nyc", "owns": "platinum", "quantity": 5},
    {"name": "john", "place": "chicago", "owns": "brass", "quantity": 60},
    {"name": "john", "place": "chicago", "owns": "silver", "quantity": 40}
]

all_names = set(d["name"] for d in my_dicts)
all_places = set(d["place"] for d in my_dicts)

merged = {name : {place : {} for place in all_places} for name in all_names}

for d in my_dicts:
    merged[d["name"]][d["place"]][d["owns"]] = d["quantity"]

import pprint
pprint.pprint(merged)

# {'jane': {'chicago': {}, 'nyc': {'platinum': 5}},
#  'john': {'chicago': {'brass': 60, 'silver': 40},
#           'nyc': {'gold': 30, 'silver': 20}}}

Then convert it to the format you want:

new_dicts = [{"name" : name, "place" : place} for name in all_names for place in all_places if merged[name][place]]
for d in new_dicts:
    d.update(merged[d["name"]][d["place"]])
pprint.pprint(new_dicts)

# [{'name': 'jane', 'place': 'nyc', 'platinum': 5},
#  {'gold': 30, 'name': 'john', 'place': 'nyc', 'silver': 20},
#  {'brass': 60, 'name': 'john', 'place': 'chicago', 'silver': 40}]
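
The grid above pre-creates an empty cell for every name/place pair, including pairs that never occur (such as jane in chicago). A possible variation, offered as a sketch and not as part of the original answer, uses nested `defaultdict`s so that cells are created only when data arrives.

```python
from collections import defaultdict

my_dicts = [
    {"name": "john", "place": "nyc", "owns": "gold", "quantity": 30},
    {"name": "john", "place": "nyc", "owns": "silver", "quantity": 20},
    {"name": "jane", "place": "nyc", "owns": "platinum", "quantity": 5},
    {"name": "john", "place": "chicago", "owns": "brass", "quantity": 60},
    {"name": "john", "place": "chicago", "owns": "silver", "quantity": 40},
]

# name -> place -> {owns: quantity}; inner levels appear on first write.
merged = defaultdict(lambda: defaultdict(dict))
for d in my_dicts:
    merged[d["name"]][d["place"]][d["owns"]] = d["quantity"]

# Flatten back into one dict per (name, place) pair that has data.
new_dicts = []
for name, places in merged.items():
    for place, holdings in places.items():
        row = {"name": name, "place": place}
        row.update(holdings)
        new_dicts.append(row)

print(new_dicts)
```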

Answer 3: (score: 0)

from itertools import groupby
result, get_owns = [], lambda x: x["owns"]
get_details =  lambda x: (x["name"], x["place"])

# Sort and group the data based on name and place
for key, grp in groupby(sorted(data, key=get_details), key=get_details):

    # Create a dictionary with the name and place
    temp = dict(zip(("name", "place"), key))

    # Sort and group the grouped data based on owns
    for owns, grp1 in groupby(sorted(grp, key=get_owns), key=get_owns):

        # For each material, find and add the sum of quantity in temp
        temp[owns] = sum(item["quantity"] for item in grp1)

    # Add the temp dictionary to the result :-)
    result.append(temp)
print result

Output

[{'name': 'jane', 'place': 'nyc', 'platinum': 5},
 {'brass': 60, 'name': 'john', 'place': 'chicago', 'silver': 40},
 {'gold': 30, 'name': 'john', 'place': 'nyc', 'silver': 20}]
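
Unlike the other answers, this one sums the quantities when the same name, place and material appear more than once. `groupby` only groups adjacent items, which is why the data is sorted before each grouping step. A sort-free sketch of the same summing behaviour, using `collections.Counter` (my substitution, not the answer's approach), could look like this. The extra gold record at the end of `data` is invented only to show the summing.

```python
from collections import Counter, defaultdict

data = [
    {"name": "john", "place": "nyc", "owns": "gold", "quantity": 30},
    {"name": "john", "place": "nyc", "owns": "silver", "quantity": 20},
    {"name": "jane", "place": "nyc", "owns": "platinum", "quantity": 5},
    {"name": "john", "place": "chicago", "owns": "brass", "quantity": 60},
    {"name": "john", "place": "chicago", "owns": "silver", "quantity": 40},
    # Hypothetical duplicate, added only to demonstrate summing.
    {"name": "john", "place": "nyc", "owns": "gold", "quantity": 5},
]

# (name, place) -> Counter mapping each material to its summed quantity.
totals = defaultdict(Counter)
for item in data:
    totals[item["name"], item["place"]][item["owns"]] += item["quantity"]

result = []
for (name, place), counts in totals.items():
    row = {"name": name, "place": place}
    row.update(counts)
    result.append(row)

print(result)
```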

Answer 4: (score: 0)

Here is one way to do it:

dicts = [
    {"name": "john", "place": "nyc", "owns": "gold", "quantity": 30},
    {"name": "john", "place": "nyc", "owns": "silver", "quantity": 20},
    {"name": "jane", "place": "nyc", "owns": "platinum", "quantity": 5},
    {"name": "john", "place": "chicago", "owns": "brass", "quantity": 60},
    {"name": "john", "place": "chicago", "owns": "silver", "quantity": 40}
]

We create a transformed dict with place-name as the key and the output dict as the value:

transformed_dict = {}
for a_dict in dicts:
    key = '{}-{}'.format(a_dict['place'], a_dict['name'])
    if key not in transformed_dict:
        transformed_dict[key] = {'name': a_dict['name'], 'place': a_dict['place'], a_dict['owns']: a_dict['quantity']}
    else:
        transformed_dict[key][a_dict['owns']] = a_dict['quantity']

transformed_dict now looks like:

{'chicago-john': {'brass': 60,
                  'name': 'john',
                  'place': 'chicago',
                  'silver': 40},
 'nyc-jane': {'name': 'jane', 'place': 'nyc', 'platinum': 5},
 'nyc-john': {'gold': 30, 'name': 'john', 'place': 'nyc', 'silver': 20}}

pprint(list(transformed_dict.values())) gives us what we want:

[{'gold': 30, 'name': 'john', 'place': 'nyc', 'silver': 20},
 {'brass': 60, 'name': 'john', 'place': 'chicago', 'silver': 40},
 {'name': 'jane', 'place': 'nyc', 'platinum': 5}]
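
One thing to watch with string keys such as '{}-{}': if a name or place can itself contain a hyphen, two different pairs may produce the same key and be merged by mistake. A hedged alternative is to use the tuple (place, name) as the key, since tuples of strings are hashable and keep the two parts separate. The two records and the helper names below are invented purely to show the collision.

```python
def string_key(d):
    # Key format used in the answer above.
    return "{}-{}".format(d["place"], d["name"])

def tuple_key(d):
    # Alternative: keep place and name as separate tuple elements.
    return (d["place"], d["name"])

# Hypothetical records whose parts contain hyphens.
a = {"name": "york-jane", "place": "new"}
b = {"name": "jane", "place": "new-york"}

print(string_key(a) == string_key(b))  # True: both give "new-york-jane"
print(tuple_key(a) == tuple_key(b))    # False: the tuples differ
```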