Currently I have a dictionary whose keys are zip codes and whose values are also dictionaries.
d = { 94111: {'a': 5, 'b': 7, 'd': 7},
95413: {'a': 6, 'd': 4},
84131: {'a': 5, 'b': 15, 'c': 10, 'd': 11},
73173: {'a': 15, 'c': 10, 'd': 15},
80132: {'b': 7, 'c': 7, 'd': 7} }
Then there is a second dictionary that maps each zip code to the state it belongs to.
states = {94111: "TX", 84131: "TX", 95413: "AL", 73173: "AL", 80132: "AL"}
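If a zip code in the dictionary states matches one of the keys in d, the values should be summed per state and placed in a new dictionary, as in the expected output.
Expected output: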
{'TX': {'a': 10, 'b': 22, 'd': 18, 'c': 10}, 'AL': {'a': 21, 'd': 26, 'c': 17, 'b': 7}}
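This is the direction I have been going in so far, but I am not sure how to build a dictionary that looks like the expected output once two keys match.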
def zips(d, states):
    result = dict()
    for key, value in d.items():
        for keys, values in states.items():
            if key == keys:
                pass  # unsure how to add value into result here
zips(d, states)
Answer 0 (score: 11)
Use the collections module.
For example:
from collections import defaultdict, Counter
d = { 94111: {'a': 5, 'b': 7, 'd': 7},
95413: {'a': 6, 'd': 4},
84131: {'a': 5, 'b': 15, 'c': 10, 'd': 11},
73173: {'a': 15, 'c': 10, 'd': 15},
80132: {'b': 7, 'c': 7, 'd': 7} }
states = {94111: "TX", 84131: "TX", 95413: "AL", 73173: "AL", 80132: "AL"}
result = defaultdict(Counter)
for k, v in d.items():
    if k in states:
        result[states[k]] += Counter(v)
print(result)
Output:
defaultdict(<class 'collections.Counter'>, {'AL': Counter({'d': 26, 'a': 21, 'c': 17, 'b': 7}),
'TX': Counter({'b': 22, 'd': 18, 'a': 10, 'c': 10})})
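The question asked for a function named zips that returns plain dictionaries. A minimal sketch of how the loop above could be wrapped that way follows; the local names totals, zip_code, and counts are illustrative choices, not part of the original answer.

```python
from collections import Counter, defaultdict


def zips(d, states):
    """Sum the per-zip-code counts in d by state, using states as the lookup."""
    totals = defaultdict(Counter)
    for zip_code, counts in d.items():
        if zip_code in states:
            totals[states[zip_code]] += Counter(counts)
    # Copy each Counter into a plain dict so the result matches the expected output.
    return {state: dict(counts) for state, counts in totals.items()}


# Sample data copied from the question.
d = {94111: {'a': 5, 'b': 7, 'd': 7},
     95413: {'a': 6, 'd': 4},
     84131: {'a': 5, 'b': 15, 'c': 10, 'd': 11},
     73173: {'a': 15, 'c': 10, 'd': 15},
     80132: {'b': 7, 'c': 7, 'd': 7}}
states = {94111: "TX", 84131: "TX", 95413: "AL", 73173: "AL", 80132: "AL"}

result = zips(d, states)
print(result)
```

Note that adding Counter objects with + drops keys whose total is zero or negative, which does not matter for the positive counts in this data.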
Answer 1 (score: 2)
You can just use a defaultdict and accumulate the counts in a loop:
from collections import defaultdict
expected_output = defaultdict(lambda: defaultdict(int))
for postcode, state in states.items():
    for key, value in d.get(postcode, {}).items():
        expected_output[state][key] += value
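One consequence of iterating over states and calling d.get(postcode, {}) is that zip codes without data are skipped silently. The sketch below illustrates this with hypothetical data (the zip codes 10001 and 99999 and the state "NY" are made up for the example and do not appear in the question).

```python
from collections import defaultdict

# Hypothetical data: 10001 has a state but no counts, 99999 has counts but no state.
d = {94111: {'a': 5, 'b': 7}, 84131: {'a': 5, 'c': 10}, 99999: {'a': 1}}
states = {94111: "TX", 84131: "TX", 10001: "NY"}

totals = defaultdict(lambda: defaultdict(int))
for postcode, state in states.items():
    # d.get returns an empty dict for unknown zip codes, so the inner loop does nothing.
    for key, value in d.get(postcode, {}).items():
        totals[state][key] += value

# Convert the nested defaultdicts to plain dicts for display.
plain = {state: dict(counts) for state, counts in totals.items()}
print(plain)  # {'TX': {'a': 10, 'b': 7, 'c': 10}}
```

A state whose zip codes all lack data never appears in the result, because totals[state] is only touched inside the inner loop.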
Answer 2 (score: 1)
This complements Rakesh's answer with a version that stays closer to your code:
res = {v: {} for v in states.values()}
for k, v in states.items():
    if k in d:
        sub_dict = d[k]
        output_dict = res[v]
        for sub_k, sub_v in sub_dict.items():
            output_dict[sub_k] = output_dict.get(sub_k, 0) + sub_v
Answer 3 (score: 1)
You can use the following:
d = { 94111: {'a': 5, 'b': 7, 'd': 7},
95413: {'a': 6, 'd': 4},
84131: {'a': 5, 'b': 15, 'c': 10, 'd': 11},
73173: {'a': 15, 'c': 10, 'd': 15},
80132: {'b': 7, 'c': 7, 'd': 7} }
states = {94111: "TX", 84131: "TX", 95413: "AL", 73173: "AL", 80132: "AL"}
out = {i: 0 for i in states.values()}
for key, value in d.items():
    if key in states:
        if not out[states[key]]:
            # copy, so that the += updates below do not mutate the dictionaries in d
            out[states[key]] = dict(value)
        else:
            for k, v in value.items():
                if k in out[states[key]]:
                    out[states[key]][k] += v
                else:
                    out[states[key]][k] = v
# out -> {'TX': {'a': 10, 'b': 22, 'd': 18, 'c': 10}, 'AL': {'a': 21, 'd': 26, 'c': 17, 'b': 7}}
Answer 4 (score: 1)
You can use the Counter class to count objects:
from collections import Counter
d = { 94111: {'a': 5, 'b': 7, 'd': 7},
95413: {'a': 6, 'd': 4},
84131: {'a': 5, 'b': 15, 'c': 10, 'd': 11},
73173: {'a': 15, 'c': 10, 'd': 15},
80132: {'b': 7, 'c': 7, 'd': 7} }
states = {94111: "TX", 84131: "TX", 95413: "AL", 73173: "AL", 80132: "AL"}
new_d = {}
for k, v in d.items():
    if k in states:
        new_d.setdefault(states[k], Counter()).update(v)
print(new_d)
# {'TX': Counter({'b': 22, 'd': 18, 'a': 10, 'c': 10}), 'AL': Counter({'d': 26, 'a': 21, 'c': 17, 'b': 7})}
new_d can then be converted to a dictionary of dictionaries.
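The conversion snippet itself did not survive in this copy of the answer. One plausible way to do it, offered as an assumption rather than the author's original code, is a dict comprehension that copies each Counter into a plain dict (the name plain is illustrative):

```python
from collections import Counter

# Sample data copied from the question.
d = {94111: {'a': 5, 'b': 7, 'd': 7},
     95413: {'a': 6, 'd': 4},
     84131: {'a': 5, 'b': 15, 'c': 10, 'd': 11},
     73173: {'a': 15, 'c': 10, 'd': 15},
     80132: {'b': 7, 'c': 7, 'd': 7}}
states = {94111: "TX", 84131: "TX", 95413: "AL", 73173: "AL", 80132: "AL"}

new_d = {}
for k, v in d.items():
    if k in states:
        new_d.setdefault(states[k], Counter()).update(v)

# Assumed reconstruction: dict() copies each Counter into an ordinary dictionary.
plain = {state: dict(counts) for state, counts in new_d.items()}
print(plain)
```

Since Counter is a dict subclass, the conversion only matters when the exact type or the printed representation is important.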
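Answer 5 (score: 1)
You can take advantage of dict's .items() method, which yields (key, value) tuples, and build a result in a simple one-liner:
new_dict = {value: d[key] for key, value in states.items()}
Note that this maps each state to the dictionary of the last zip code listed for it; earlier entries are overwritten rather than summed, so it does not produce the expected output from the question.
Output:
# new_dict -> {'TX': {'a': 5, 'b': 15, 'c': 10, 'd': 11}, 'AL': {'b': 7, 'c': 7, 'd': 7}}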
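If the values of all zip codes in a state need to be summed, a comprehension can still do it by adding up one Counter per zip code. This is a sketch of an alternative, not the answer's original code; the names summed, z, and s are illustrative.

```python
from collections import Counter

# Sample data copied from the question.
d = {94111: {'a': 5, 'b': 7, 'd': 7},
     95413: {'a': 6, 'd': 4},
     84131: {'a': 5, 'b': 15, 'c': 10, 'd': 11},
     73173: {'a': 15, 'c': 10, 'd': 15},
     80132: {'b': 7, 'c': 7, 'd': 7}}
states = {94111: "TX", 84131: "TX", 95413: "AL", 73173: "AL", 80132: "AL"}

# For each state, add up the Counters of every zip code that belongs to it.
# Counter addition drops keys whose total is zero or negative, which is fine
# for the positive counts in this data.
summed = {
    state: dict(sum((Counter(d[z]) for z, s in states.items()
                     if s == state and z in d), Counter()))
    for state in set(states.values())
}
print(summed)
```

This rescans states once per state, so the loop-based answers are likely a better fit for large inputs.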
Answer 6 (score: 0)
You may want to reconsider using a dict to store your data. If you store the data with pandas, the aggregation becomes much easier.
import pandas as pd
# rows are zip codes, columns are the keys 'a'..'d'; missing keys become NaN
df = pd.DataFrame(d).transpose()
df['states'] = pd.Series(states)
df.groupby('states').sum()
>> a b c d
>>states
>>AL 21.0 7.0 17.0 26.0
>>TX 10.0 22.0 10.0 18.0