List of Python dictionaries with tuples as keys

Time: 2018-10-31 12:38:39

Tags: python list dictionary

I have a dictionary like this:

Counter({('know', 'you'): 1053, ('know', 'i'): 847, ('il', 'i'): 784, 
         ('want', 'to'): 680, ('want', 'you'): 561, ('il', 'you'): 561, 
         ('come', 'on'): 557, ('know', 't'): 499, ('go', 'to'): 447, 
         ('right', 'all'): 440, ('want', 'i'): 430, ('know', 'don'): 410, 
         ('get', 'to'): 409, ('like', 'you'): 397, ('like', 'i'): 338, 
         ('get', 'you'): 336, ('il', 'be'): 330})

I want to create a list of dictionaries in which each dictionary contains only the tuples that share the same first element, like this:

[{('know', 'you'): 1053, ('know', 'i'): 847, ('know', 't'): 499,('know', 'don'): 410}, 
 {('want', 'to'): 680, ('want', 'you'): 561, ('want', 'i'): 430},  
 {('get', 'to'): 409, ('get', 'you'): 336}, 
 {('like', 'you'): 397, ('like', 'i'): 338}]

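After that, I want to store the values of each dictionary in a nested list and create an array from it. The nested list would look like this: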

[[1053, 847, 499, 410], [680, 561, 430], [409, 336], [397, 338]]

Do you have any ideas how I could do this?

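Edit: after some comments, I realized that the second element of each tuple also has to line up across the dictionaries. So the list of dictionaries should actually look like this: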

[{('know', 'you'): 1053, ('know', 'i'): 847, ('know', 'to'): 499},
 {('want', 'you'): 5, ('want', 'i'): 430, ('want', 'to'): 680},
 {('get', 'you'): 3, ('get', 'i'): 68, ('get', 'to'): 409},
 {('like', 'you'): 397, ('like', 'i'): 338, ('like', 'to'): 345}]

2 answers:

Answer 0 (score: 3)

Do you actually need the intermediate list of dictionaries? You can get to the nested lists directly from the input dictionary.

from collections import defaultdict

out = defaultdict(list)

# input_dict is the Counter (or plain dict) from the question
for k, v in input_dict.items():
    out[k[0]].append(v)

print(out)
# defaultdict(<class 'list'>, {'know': [1053, 847, 499, 410], 'il': [784, 561, 330],
#                              'want': [680, 561, 430], 'come': [557], 'go': [447],
#                              'right': [440], 'get': [409, 336], 'like': [397, 338]})

Then, if you insist on a nested list:

print(list(out.values()))
# [[1053, 847, 499, 410], [784, 561, 330], [680, 561, 430], [557], [447], [440],
#  [409, 336], [397, 338]]
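
For the edited requirement in the question (the second tuple element has to line up across groups), the same grouping idea can be extended to a nested mapping. The following is only a sketch under assumptions: `columns` is a hypothetical, hand-picked list of second elements, the sample data is a subset of the question's counts, and filling a missing pair with 0 is my choice, not something the question specifies.

```python
from collections import defaultdict

# Subset of the counts from the question, used as sample data.
input_dict = {
    ('know', 'you'): 1053, ('know', 'i'): 847, ('want', 'to'): 680,
    ('want', 'you'): 561, ('want', 'i'): 430, ('get', 'to'): 409,
    ('get', 'you'): 336, ('like', 'you'): 397, ('like', 'i'): 338,
}

# Hypothetical choice: the second elements every row is aligned on.
columns = ['you', 'i', 'to']

# Map each first element to {second element: count}.
by_first = defaultdict(dict)
for (first, second), count in input_dict.items():
    by_first[first][second] = count

# One row per first element; a missing (first, second) pair becomes 0 (assumption).
aligned = {first: [seconds.get(c, 0) for c in columns]
           for first, seconds in by_first.items()}

print(aligned)
# {'know': [1053, 847, 0], 'want': [561, 430, 680], 'get': [336, 0, 409], 'like': [397, 338, 0]}
```

Because every row now has the same length, `list(aligned.values())` is rectangular and can be turned into an array without ragged rows.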

Answer 1 (score: 0)

To get the final result, you can use setdefault:

data = {('know', 'you'): 1053, ('know', 'i'): 847, ('il', 'i'): 784, ('want', 'to'): 680, ('want', 'you'): 561,
        ('il', 'you'): 561, ('come', 'on'): 557, ('know', 't'): 499, ('go', 'to'): 447, ('right', 'all'): 440,
        ('want', 'i'): 430, ('know', 'don'): 410, ('get', 'to'): 409, ('like', 'you'): 397, ('like', 'i'): 338,
        ('get', 'you'): 336, ('il', 'be'): 330}


result = {}
for k, v in data.items():
    result.setdefault(k[0], []).append(v)

print(list(result.values()))

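Output (the order shown is not insertion order, so it apparently comes from an interpreter whose dicts are unordered; on Python 3.7+ the groups appear in insertion order instead):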

[[561, 680, 430], [447], [397, 338], [440], [847, 1053, 499, 410], [336, 409], [784, 561, 330], [557]]

If you need the intermediate representation for some reason, you can do the following:

from itertools import groupby

data = {('know', 'you'): 1053, ('know', 'i'): 847, ('il', 'i'): 784, ('want', 'to'): 680, ('want', 'you'): 561,
        ('il', 'you'): 561, ('come', 'on'): 557, ('know', 't'): 499, ('go', 'to'): 447, ('right', 'all'): 440,
        ('want', 'i'): 430, ('know', 'don'): 410, ('get', 'to'): 409, ('like', 'you'): 397, ('like', 'i'): 338,
        ('get', 'you'): 336, ('il', 'be'): 330}


# groupby only merges adjacent items, so the items are sorted first
result = [dict(group) for _, group in groupby(sorted(data.items()), key=lambda x: x[0][0])]
print(result)

Output of the intermediate representation:

[{('come', 'on'): 557}, {('get', 'to'): 409, ('get', 'you'): 336}, {('go', 'to'): 447}, {('il', 'i'): 784, ('il', 'be'): 330, ('il', 'you'): 561}, {('know', 'i'): 847, ('know', 't'): 499, ('know', 'you'): 1053, ('know', 'don'): 410}, {('like', 'i'): 338, ('like', 'you'): 397}, {('right', 'all'): 440}, {('want', 'i'): 430, ('want', 'you'): 561, ('want', 'to'): 680}]
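
The question's second step, going from the intermediate list of dictionaries to the nested list of values, is not shown above. A minimal sketch, assuming Python 3.7+ (dicts keep insertion order) and a shortened sample of the question's data; `first_word` and `nested` are names introduced here for illustration:

```python
from itertools import groupby

# Shortened sample of the counts from the question.
data = {('know', 'you'): 1053, ('know', 'i'): 847, ('want', 'to'): 680,
        ('want', 'you'): 561, ('get', 'to'): 409, ('get', 'you'): 336}


def first_word(item):
    """Return the first element of the tuple key of a (key, value) pair."""
    return item[0][0]


# Sort by the first word only; sorted() is stable, so items within a group
# keep the order they had in data.
groups = [dict(g) for _, g in groupby(sorted(data.items(), key=first_word), key=first_word)]

# Keep only the values of each dictionary.
nested = [list(d.values()) for d in groups]

print(nested)
# [[409, 336], [1053, 847], [680, 561]]
```

Sorting on the first word alone, rather than on the whole item, keeps the counts inside each group in their original order, which here is the descending order of the Counter.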