How do I rename identical values in a Python dictionary?

Asked: 2017-10-07 13:56:34

Tags: python dictionary

I have a dictionary and I want to rename the identical values, going from something like this:

{
    33: [3, 4, 6],
    34: [3, 4, 6],
    35: [3, 4, 6],
    99: [7, 8],
    100: [7, 8],
    124: [0, 1, 2, 5],
    125: [0, 1, 2, 5],
    126: [0, 1, 2, 5],
    127: [0, 1, 2, 5]
}

to this:

{
    33: Cluster1,
    34: Cluster1,
    35: Cluster1,
    99: Cluster2,
    100: Cluster2,
    124: Cluster3,
    125: Cluster3,
    126: Cluster3,
    127: Cluster3
}

Any hints would be appreciated.

4 answers:

Answer 0 (score: 0)

You can use a defaultdict and sort the initial dictionary:

from collections import defaultdict
d = defaultdict(str)
counter = 1
s = {33: [3, 4, 6],
34: [3, 4, 6],
35: [3, 4, 6],
99: [7, 8],
100: [7, 8],
124: [0, 1, 2, 5],
125: [0, 1, 2, 5],
126: [0, 1, 2, 5],
127: [0, 1, 2, 5]}
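# Sort by key; this approach assumes equal values sit on adjacent keys
new_s = sorted(s.items(), key=lambda x:x[0])
d1 = new_s[0][-1]  # value of the previous entry
for a, b in new_s:
   if b == d1:
      # same value as the previous key: stay in the current cluster
      d[a] = "Cluster{}".format(counter)
      d1 = b
   else:
      # value changed: start a new cluster
      d[a] = "Cluster{}".format(counter+1)
      d1 = b
      counter += 1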

for a, b in sorted(d.items(), key=lambda x: x[0]):
   print(a, b)

Output (from Python 2; under Python 3 the same print call shows 33 Cluster1 instead of a tuple):

(33, 'Cluster1')
(34, 'Cluster1')
(35, 'Cluster1')
(99, 'Cluster2')
(100, 'Cluster2')
(124, 'Cluster3')
(125, 'Cluster3')
(126, 'Cluster3')
(127, 'Cluster3')
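The loop above only compares each value with the one before it in key order, so it quietly assumes that equal values live on neighbouring keys. Here is a minimal sketch of what happens when they don't. The function name label_adjacent and the small sample dictionary are made up for illustration; the logic is meant to mirror the loop above, not replace it.

```python
def label_adjacent(s):
    """Label runs of equal values in key order, like the loop above."""
    labels = {}
    counter = 0
    previous = object()  # sentinel that never equals a real value
    for key, value in sorted(s.items()):
        if value != previous:
            # value differs from the previous key's value: new cluster
            counter += 1
            previous = value
        labels[key] = "Cluster{}".format(counter)
    return labels


# Keys 18 and 25 hold the same list but are separated by key 21.
s = {18: [28, 29], 21: [38, 39, 40], 25: [28, 29]}
labels = label_adjacent(s)
print(labels)
```

Keys 18 and 25 end up in different clusters even though their lists are equal, which is the case a more general approach has to handle.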

Edit: a more robust solution:

s = {9: [49, 50, 51], 10: [49, 50, 51], 11: [49, 50, 51], 13: [13, 14, 15, 16, 17], 18: [28, 29], 21: [38, 39, 40], 22: [38, 39, 40], 23: [38, 39, 40], 25: [28, 29], 33: [4, 5, 6], 34: [4, 5, 6], 35: [4, 5, 6], 36: [24, 25], 37: [24, 25], 40: [13, 14, 15, 16, 17]}

import itertools
final_dict = {}
# Sort by value so that groupby sees equal lists as one consecutive run
by_value = sorted(s.items(), key=lambda x:x[-1])
for i, (value, items) in enumerate(itertools.groupby(by_value, key=lambda x:x[-1])):
   for a1, b1 in items:
      final_dict[a1] = "Cluster{}".format(i+1)

for a, b in sorted(final_dict.items(), key=lambda x:x[0]):
    print(a, b)

Output (clusters are numbered in the sorted order of the value lists):

(9, 'Cluster6')
(10, 'Cluster6')
(11, 'Cluster6')
(13, 'Cluster2')
(18, 'Cluster4')
(21, 'Cluster5')
(22, 'Cluster5')
(23, 'Cluster5')
(25, 'Cluster4')
(33, 'Cluster1') 
(34, 'Cluster1')
(35, 'Cluster1')
(36, 'Cluster3')
(37, 'Cluster3')
(40, 'Cluster2')
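With the groupby version the numbering follows the sorted order of the lists, so key 9 lands in Cluster6. If you would rather number clusters in the order they first show up when walking the keys, something along these lines might do. This is a sketch, not part of the answer above: label_by_first_appearance is a hypothetical name, and it assumes every value is a flat list of hashable items so it can be turned into a tuple.

```python
def label_by_first_appearance(s):
    """Number clusters in the order their value first appears (by key)."""
    cluster_ids = {}  # tuple(value) -> cluster number
    labels = {}
    for key in sorted(s):
        group = tuple(s[key])  # lists are unhashable, tuples are fine
        # setdefault keeps the existing number or stores the next free one
        number = cluster_ids.setdefault(group, len(cluster_ids) + 1)
        labels[key] = "Cluster{}".format(number)
    return labels


sample = {9: [49, 50, 51], 13: [13, 14], 18: [28, 29],
          25: [28, 29], 40: [13, 14]}
labels = label_by_first_appearance(sample)
for key in sorted(labels):
    print(key, labels[key])
```

The dictionary lookup also avoids sorting the values, at the cost of building one tuple per entry.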

Answer 1 (score: 0)

One thing to keep in mind: lists are not hashable. Take a look at this:

my_dict = {33: [3, 4, 6],
           34: [3, 4, 6],
           35: [3, 4, 6],
           99: [7, 8],
           100: [7, 8],
           124: [0, 1, 2, 5],
           125: [0, 1, 2, 5],
           126: [0, 1, 2, 5],
           127: [0, 1, 2, 5]}

# the values for keys 33 and 34 are equal but are different objects
print(id(my_dict[33]))
print(id(my_dict[34]))


def to_hash_str(my_list):
    from hashlib import sha256
    import json
    return sha256(json.dumps(my_list).encode('utf-8')).hexdigest()


clusters_mapping = {to_hash_str(v): v for v in my_dict.values()}

print(clusters_mapping)

new_dict = {k: clusters_mapping[to_hash_str(v)] for k, v in my_dict.items()}

print(new_dict)

# the values for keys 33 and 34 are now the same object
print(id(new_dict[33]))
print(id(new_dict[34]))
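Note that the snippet above makes equal lists share one object, but it does not yet produce the Cluster names from the question. As for the hashability point itself, here is a small sketch of what "not hashable" means in practice, and of a tuple as a lighter stand-in for the JSON plus sha256 key. The tuple route is my assumption that the values are flat lists of hashable items; the names values and lookup are just for illustration.

```python
values = [3, 4, 6]

try:
    {values: "Cluster1"}  # a list cannot be used as a dict key
    list_is_hashable = True
except TypeError as exc:
    list_is_hashable = False
    print("lists are unhashable:", exc)

# A tuple with the same items is hashable and compares equal by content,
# so two separate but equal lists map to the same key.
lookup = {tuple(values): "Cluster1"}
print(lookup[tuple([3, 4, 6])])
```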

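Answer 2 (score: 0)

One way to do this:

  • Create an empty dictionary, dct
  • Create a list, seen, to hold the values you have already encountered
  • Loop over the original dictionary; if you meet a value you have not seen yet, add it to seen, otherwise go on to the next step
  • Add the current key to the new dictionary, with a value built from the index of the item in seen

Here is the code:

seen = []
dct = {}
for k in d:
    if d[k] not in seen:
        seen.append(d[k])
    dct[k] = "Cluster{}".format(seen.index(d[k])+1)

Tested and works for your case.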

Answer 3 (score: 0)

d = <your dict>
set_dict = list(enumerate(set(tuple(i) for i in d.values()), 1))
{ k: 'Cluster' + str([i for i,j in set_dict if list(j) == v][0]) for k,v in d.items() }

Output:

{33: 'Cluster1',
 34: 'Cluster1',
 35: 'Cluster1',
 99: 'Cluster3',
 100: 'Cluster3',
 124: 'Cluster2',
 125: 'Cluster2',
 126: 'Cluster2',
 127: 'Cluster2'}
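The numbering in that output comes from the iteration order of the set, which is not something to rely on (here [7, 8] became Cluster3 and [0, 1, 2, 5] became Cluster2). If a reproducible numbering matters, one option is to sort the unique tuples first and look the numbers up in a dict instead of scanning a list. This is only a sketch with a shortened sample dictionary; unique, number_of and result are names I made up.

```python
d = {33: [3, 4, 6], 99: [7, 8], 100: [7, 8], 124: [0, 1, 2, 5]}

# Sort the unique tuples so the numbering no longer depends on set order.
unique = sorted(set(tuple(v) for v in d.values()))
number_of = {group: i for i, group in enumerate(unique, 1)}

result = {k: "Cluster{}".format(number_of[tuple(v)]) for k, v in d.items()}
print(result)
```

The numbers now follow the sorted order of the lists, so [0, 1, 2, 5] is always Cluster1 for this data.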