Suppose I have the following dictionaries:
{name: "john", place: "nyc", owns: "gold", quantity: 30}
{name: "john", place: "nyc", owns: "silver", quantity: 20}
{name: "jane", place: "nyc", owns: "platinum", quantity: 5}
{name: "john", place: "chicago", owns: "brass", quantity: 60}
{name: "john", place: "chicago", owns: "silver", quantity: 40}
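I have hundreds of these small dictionaries. I need to merge them on a subset of their common keys, in this example (name, place), and create new dictionaries. In the end, the output should look like this: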
{name: "john", place: "nyc", gold: 30, silver: 20}
{name: "jane", place: "nyc", platinum: 5}
{name: "john", place: "chicago", brass: 60, silver: 40}
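Is there an efficient way to do this? All I can think of is brute force: keep track of every possible name-place combination in some list, then for each combination go through the whole collection again and merge the matching dictionaries into a new one. Thanks!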
Answer 0 (score: 7)
First, to get the output you asked for:
data = [{'name': "john", 'place': "nyc", 'owns': "gold", 'quantity': 30},
{'name': "john", 'place': "nyc", 'owns': "silver", 'quantity': 20},
{'name': "jane", 'place': "nyc", 'owns': "platinum", 'quantity': 5},
{'name': "john", 'place': "chicago", 'owns': "brass", 'quantity': 60},
{'name': "john", 'place': "chicago", 'owns': "silver", 'quantity': 40}]
from collections import defaultdict
accumulator = defaultdict(list)
for p in data:
    # Group the (owns, quantity) pairs under their (name, place) key.
    accumulator[p['name'], p['place']].append((p['owns'], p['quantity']))
from itertools import chain
[dict(chain([('name', name), ('place', place)], rest)) for (name, place), rest in accumulator.items()]
Out[13]:
[{'name': 'jane', 'place': 'nyc', 'platinum': 5},
{'brass': 60, 'name': 'john', 'place': 'chicago', 'silver': 40},
{'gold': 30, 'name': 'john', 'place': 'nyc', 'silver': 20}]
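Now, let me point out that the list-of-dicts data structure you asked for is quite awkward. Dicts are great for lookups, but they work best when a single dict covers the whole group of objects. If you have to search linearly through a bunch of dicts to find the one you want, you immediately lose the entire benefit that dict provides. That leaves a couple of options: go one level deeper and nest dicts inside a dict, or use something else entirely.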
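As a rough sketch of the "nest dicts inside a dict" option: the outer dict is keyed by the (name, place) tuple, so finding one person's holdings is a single lookup rather than a scan. The name holdings_by_person is only illustrative, and the data is repeated so the snippet runs on its own.

```python
from collections import defaultdict

data = [
    {'name': "john", 'place': "nyc", 'owns': "gold", 'quantity': 30},
    {'name': "john", 'place': "nyc", 'owns': "silver", 'quantity': 20},
    {'name': "jane", 'place': "nyc", 'owns': "platinum", 'quantity': 5},
    {'name': "john", 'place': "chicago", 'owns': "brass", 'quantity': 60},
    {'name': "john", 'place': "chicago", 'owns': "silver", 'quantity': 40},
]

# Outer dict: (name, place) -> inner dict mapping each holding to its quantity.
holdings_by_person = defaultdict(dict)
for p in data:
    holdings_by_person[p['name'], p['place']][p['owns']] = p['quantity']

# One hashed lookup; no linear search through a list of dicts.
print(holdings_by_person['john', 'chicago'])
```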
May I suggest a list of meaningful objects instead, each representing one of these people? Either create your own class, or use a namedtuple:
from collections import namedtuple
Person = namedtuple('Person','name place holdings')
[Person(name, place, dict(rest)) for (name, place), rest in accumulator.items()]
Out[17]:
[Person(name='jane', place='nyc', holdings={'platinum': 5}),
Person(name='john', place='chicago', holdings={'brass': 60, 'silver': 40}),
Person(name='john', place='nyc', holdings={'silver': 20, 'gold': 30})]
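For the "create your own class" alternative, a minimal sketch might look like the following. The method total_quantity is a made-up example of behaviour a class can carry and a plain tuple cannot; nothing in the question requires it.

```python
class Person(object):
    """Hypothetical stand-in for the namedtuple: one person at one place."""

    def __init__(self, name, place, holdings=None):
        self.name = name
        self.place = place
        # Accepts a dict or an iterable of (owns, quantity) pairs.
        self.holdings = dict(holdings or {})

    def __repr__(self):
        return 'Person(name=%r, place=%r, holdings=%r)' % (
            self.name, self.place, self.holdings)

    def total_quantity(self):
        # Sum of all quantities held, regardless of material.
        return sum(self.holdings.values())


john_nyc = Person('john', 'nyc', [('gold', 30), ('silver', 20)])
print(john_nyc.total_quantity())
```

Unlike a namedtuple, instances of such a class are mutable, which may or may not be what you want.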
Answer 1 (score: 1)
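My personal strategy is outlined below. Define a key generator that produces a key from a given dict instance, then group the dicts by that generated key into a separate dict. Once you have iterated over all the elements and updated each group by its key, just return the grouped dict's .values().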
dicts = [
{"name": "john", "place": "nyc", "owns": "gold", "quantity": 30},
{"name": "john", "place": "nyc", "owns": "silver", "quantity": 20},
{"name": "jane", "place": "nyc", "owns": "platinum", "quantity": 5},
{"name": "john", "place": "chicago", "owns": "brass", "quantity": 60},
{"name": "john", "place": "chicago", "owns": "silver", "quantity": 40}
]
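def get_key(instance):
    # Build the grouping key from the shared fields.
    return "%s-%s" % (instance.get("name"), instance.get("place"), )
grouped = {}
for dict_ in dicts:
    # Create the group on first sight, then merge the record into it.
    grouped[get_key(dict_)] = grouped.get(get_key(dict_), {})
    grouped[get_key(dict_)].update(dict_)
print(list(grouped.values()))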
# [
# {'owns': 'platinum', 'place': 'nyc', 'name': 'jane', 'quantity': 5},
# {'name': 'john', 'place': 'nyc', 'owns': 'silver', 'quantity': 20},
# {'name': 'john', 'place': 'chicago', 'owns': 'silver', 'quantity': 40}
# ]
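Note that update() copies owns and quantity as they are, so later records overwrite earlier ones, which is why the output above keeps only one holding per group. A possible adjustment, assuming the goal is the shape shown in the question, stores each quantity under its owns value instead. The helper name merge_holdings is made up for this sketch.

```python
dicts = [
    {"name": "john", "place": "nyc", "owns": "gold", "quantity": 30},
    {"name": "john", "place": "nyc", "owns": "silver", "quantity": 20},
    {"name": "jane", "place": "nyc", "owns": "platinum", "quantity": 5},
    {"name": "john", "place": "chicago", "owns": "brass", "quantity": 60},
    {"name": "john", "place": "chicago", "owns": "silver", "quantity": 40},
]


def get_key(instance):
    # Same string key as in the answer above.
    return "%s-%s" % (instance.get("name"), instance.get("place"))


def merge_holdings(records):
    grouped = {}
    for record in records:
        # Start each group with the shared fields only.
        merged = grouped.setdefault(
            get_key(record),
            {"name": record["name"], "place": record["place"]},
        )
        # Store the quantity under the material name, not under 'quantity'.
        merged[record["owns"]] = record["quantity"]
    return list(grouped.values())


result = merge_holdings(dicts)
print(result)
```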
Answer 2 (score: 0)
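Possibly a crazy idea, but how about a nested dict (a dict of dicts)? It works like a 2D array whose row and column indices are the name and the place.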
my_dicts = [
{"name": "john", "place": "nyc", "owns": "gold", "quantity": 30},
{"name": "john", "place": "nyc", "owns": "silver", "quantity": 20},
{"name": "jane", "place": "nyc", "owns": "platinum", "quantity": 5},
{"name": "john", "place": "chicago", "owns": "brass", "quantity": 60},
{"name": "john", "place": "chicago", "owns": "silver", "quantity": 40}
]
all_names = set(d["name"] for d in my_dicts)
all_places = set(d["place"] for d in my_dicts)
merged = {name : {place : {} for place in all_places} for name in all_names}
for d in my_dicts:
    merged[d["name"]][d["place"]][d["owns"]] = d["quantity"]
import pprint
pprint.pprint(merged)
# {'jane': {'chicago': {}, 'nyc': {'platinum': 5}},
# 'john': {'chicago': {'brass': 60, 'silver': 40},
# 'nyc': {'gold': 30, 'silver': 20}}}
Then convert to the format you want:
new_dicts = [{"name" : name, "place" : place} for name in all_names for place in all_places if merged[name][place]]
for d in new_dicts:
    d.update(merged[d["name"]][d["place"]])
pprint.pprint(new_dicts)
# [{'name': 'jane', 'place': 'nyc', 'platinum': 5},
# {'gold': 30, 'name': 'john', 'place': 'nyc', 'silver': 20},
# {'brass': 60, 'name': 'john', 'place': 'chicago', 'silver': 40}]
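One side effect of building merged from all_names and all_places is that every name/place pair gets an entry, including empty ones such as jane in chicago. A variant sketch, assuming collections.defaultdict is acceptable here, creates inner dicts only for pairs that actually occur, so the filter on empty entries is no longer needed.

```python
from collections import defaultdict

my_dicts = [
    {"name": "john", "place": "nyc", "owns": "gold", "quantity": 30},
    {"name": "john", "place": "nyc", "owns": "silver", "quantity": 20},
    {"name": "jane", "place": "nyc", "owns": "platinum", "quantity": 5},
    {"name": "john", "place": "chicago", "owns": "brass", "quantity": 60},
    {"name": "john", "place": "chicago", "owns": "silver", "quantity": 40},
]

# name -> place -> {owns: quantity}; missing levels are created on first write.
merged = defaultdict(lambda: defaultdict(dict))
for d in my_dicts:
    merged[d["name"]][d["place"]][d["owns"]] = d["quantity"]

# Flatten back into one dict per (name, place) pair that actually occurred.
new_dicts = []
for name, places in merged.items():
    for place, holdings in places.items():
        row = {"name": name, "place": place}
        row.update(holdings)
        new_dicts.append(row)
```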
Answer 3 (score: 0)
from itertools import groupby
# data is the list of dicts from the question
result, get_owns = [], lambda x: x["owns"]
get_details = lambda x: (x["name"], x["place"])
# Sort and group the data based on name and place
for key, grp in groupby(sorted(data, key=get_details), key=get_details):
    # Create a dictionary with the name and place
    temp = dict(zip(("name", "place"), key))
    # Sort and group the grouped data based on owns
    for owns, grp1 in groupby(sorted(grp, key=get_owns), key=get_owns):
        # For each material, find and add the sum of quantity in temp
        temp[owns] = sum(item["quantity"] for item in grp1)
    # Add the temp dictionary to the result :-)
    result.append(temp)
print(result)
Output
[{'name': 'jane', 'place': 'nyc', 'platinum': 5},
{'brass': 60, 'name': 'john', 'place': 'chicago', 'silver': 40},
{'gold': 30, 'name': 'john', 'place': 'nyc', 'silver': 20}]
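Two details of this approach may be worth illustrating. groupby only groups adjacent items, hence the sorted() calls, and because quantities are summed, repeated records for the same material are added together rather than overwritten. The sketch below wraps the same logic in a function (merge_with_groupby is an illustrative name) and feeds it a made-up duplicate record.

```python
from itertools import groupby


def merge_with_groupby(records):
    get_details = lambda x: (x["name"], x["place"])
    get_owns = lambda x: x["owns"]
    result = []
    # groupby needs its input sorted by the same key it groups on.
    for key, grp in groupby(sorted(records, key=get_details), key=get_details):
        temp = dict(zip(("name", "place"), key))
        for owns, grp1 in groupby(sorted(grp, key=get_owns), key=get_owns):
            # Duplicate materials within a group are summed.
            temp[owns] = sum(item["quantity"] for item in grp1)
        result.append(temp)
    return result


# Hypothetical input: john in nyc appears twice with gold.
sample = [
    {"name": "john", "place": "nyc", "owns": "gold", "quantity": 30},
    {"name": "jane", "place": "nyc", "owns": "platinum", "quantity": 5},
    {"name": "john", "place": "nyc", "owns": "gold", "quantity": 12},
]
merged = merge_with_groupby(sample)
print(merged)
```

Whether summing is the right behaviour for duplicates depends on the data; the other answers in this thread keep the last value instead.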
Answer 4 (score: 0)
Here is one approach:
dicts = [
{"name": "john", "place": "nyc", "owns": "gold", "quantity": 30},
{"name": "john", "place": "nyc", "owns": "silver", "quantity": 20},
{"name": "jane", "place": "nyc", "owns": "platinum", "quantity": 5},
{"name": "john", "place": "chicago", "owns": "brass", "quantity": 60},
{"name": "john", "place": "chicago", "owns": "silver", "quantity": 40}
]
We create a transformed dict whose keys are place-name strings and whose values are the output dicts:
transformed_dict = {}
for a_dict in dicts:
    key = '{}-{}'.format(a_dict['place'], a_dict['name'])
    if key not in transformed_dict:
        transformed_dict[key] = {'name': a_dict['name'], 'place': a_dict['place'], a_dict['owns']: a_dict['quantity']}
    else:
        transformed_dict[key][a_dict['owns']] = a_dict['quantity']
transformed_dict now looks like:
{'chicago-john': {'brass': 60,
'name': 'john',
'place': 'chicago',
'silver': 40},
'nyc-jane': {'name': 'jane', 'place': 'nyc', 'platinum': 5},
'nyc-john': {'gold': 30, 'name': 'john', 'place': 'nyc', 'silver': 20}}
pprint(list(transformed_dict.values())) (with pprint imported via from pprint import pprint) gives us what we want:
[{'gold': 30, 'name': 'john', 'place': 'nyc', 'silver': 20},
{'brass': 60, 'name': 'john', 'place': 'chicago', 'silver': 40},
{'name': 'jane', 'place': 'nyc', 'platinum': 5}]
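One caveat with a string key such as place-name: if the values themselves can contain a hyphen, two different pairs may produce the same key. A sketch that sidesteps this by using a tuple as the key is shown below; merge_on and its parameter names are hypothetical, chosen only for this illustration.

```python
def merge_on(records, group_keys=("name", "place"),
             label_key="owns", value_key="quantity"):
    merged = {}
    for record in records:
        # A tuple key keeps the fields separate, so no delimiter can collide.
        key = tuple(record[k] for k in group_keys)
        row = merged.setdefault(key, {k: record[k] for k in group_keys})
        row[record[label_key]] = record[value_key]
    return list(merged.values())


# Made-up records whose 'place-name' strings would both be 'a-b-c'.
tricky = [
    {"name": "c", "place": "a-b", "owns": "gold", "quantity": 1},
    {"name": "b-c", "place": "a", "owns": "silver", "quantity": 2},
]
rows = merge_on(tricky)
print(rows)
```

With the string key, both records would land in the same output dict; with the tuple key they stay separate.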