I have an ordered dictionary and an array of ints, both with the same number of entries:
Dict = {"A": "Apple", "A": "Ant", "A": "Apple", "B": "Ball", "B": "Beach", "C": "Cat", "C": "Cat", "D": "Ball"}
Arr = [1, 2, 3, 4, 5, 6, 7, 8]
I want to remove the duplicates (non-unique key/value pairs) from the dictionary and sum the corresponding integers in the array:
New Dict = {"A": "Apple", "A": "Ant", "B": "Ball", "B": "Beach", "C": "Cat", "D": "Ball"}
New Arr = [4, 2, 4, 5, 13, 8]
Any suggestions for an elegant way to solve this?
Answer 0 (score: 0)
You can't have a dictionary like that, because keys must be unique (see the comments), but you can have a list of tuples:
>>> ts = [("A", "Apple"), ("A", "Ant"), ("A", "Apple"), ("B", "Ball"), ("B", "Beach"), ("C", "Cat"), ("C", "Cat"), ("D", "Ball")]
>>> arr = [1, 2, 3, 4, 5, 6, 7, 8]
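As a quick check of why the dictionary in the question cannot exist as written: a Python dict keeps one value per key, so repeated keys in a literal are silently overwritten by the last value. A minimal sketch (the name `d` is only for illustration):

```python
# Duplicate keys in a dict literal collapse; the last value for each key wins.
d = {"A": "Apple", "A": "Ant", "A": "Apple", "B": "Ball", "B": "Beach",
     "C": "Cat", "C": "Cat", "D": "Ball"}

print(d)       # {'A': 'Apple', 'B': 'Beach', 'C': 'Cat', 'D': 'Ball'}
print(len(d))  # 4, not 8
```

This is why the pairs are stored as `(key, value)` tuples instead.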
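You can zip the tuples with the array to associate each tuple with its number (sorting the result puts equal tuples next to each other):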
>>> sorted(zip(ts, arr))
[(('A', 'Ant'), 2), (('A', 'Apple'), 1), (('A', 'Apple'), 3), (('B', 'Ball'), 4), (('B', 'Beach'), 5), (('C', 'Cat'), 6), (('C', 'Cat'), 7), (('D', 'Ball'), 8)]
The groupby function from the itertools module is very handy on a sorted list. The key function lambda x: x[0] means that the (tuple, number) pairs are grouped on the tuple:
>>> import itertools
>>> [(t, list(g)) for t, g in itertools.groupby(sorted(zip(ts, arr)), lambda x: x[0])]
[(('A', 'Ant'), [(('A', 'Ant'), 2)]), (('A', 'Apple'), [(('A', 'Apple'), 1), (('A', 'Apple'), 3)]), (('B', 'Ball'), [(('B', 'Ball'), 4)]), (('B', 'Beach'), [(('B', 'Beach'), 5)]), (('C', 'Cat'), [(('C', 'Cat'), 6), (('C', 'Cat'), 7)]), (('D', 'Ball'), [(('D', 'Ball'), 8)])]
But you don't want the lists; you want the sum of the numbers n in each group g:
>>> L = [(t, sum(n for _, n in g)) for t, g in itertools.groupby(sorted(zip(ts, arr)), lambda x: x[0])]
>>> L
[(('A', 'Ant'), 2), (('A', 'Apple'), 4), (('B', 'Ball'), 4), (('B', 'Beach'), 5), (('C', 'Cat'), 13), (('D', 'Ball'), 8)]
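A note on the `sorted` call in the expressions above: `itertools.groupby` only merges adjacent items with equal keys, so it starts a new group every time the key changes. The following sketch compares the number of groups with and without sorting; the names `unsorted_keys` and `sorted_keys` are illustrative.

```python
import itertools

ts = [("A", "Apple"), ("A", "Ant"), ("A", "Apple"), ("B", "Ball"),
      ("B", "Beach"), ("C", "Cat"), ("C", "Cat"), ("D", "Ball")]
arr = [1, 2, 3, 4, 5, 6, 7, 8]

# Without sorting, the two ("A", "Apple") entries are separated by
# ("A", "Ant"), so groupby puts them in two different groups.
unsorted_keys = [t for t, _ in itertools.groupby(zip(ts, arr), lambda x: x[0])]

# After sorting, equal tuples are adjacent and fall into a single group.
sorted_keys = [t for t, _ in itertools.groupby(sorted(zip(ts, arr)), lambda x: x[0])]

print(len(unsorted_keys))  # 7
print(len(sorted_keys))    # 6
```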
If you want to get back to the initial format (one sequence of tuples and one sequence of numbers):
>>> list(zip(*L))
[(('A', 'Ant'), ('A', 'Apple'), ('B', 'Ball'), ('B', 'Beach'), ('C', 'Cat'), ('D', 'Ball')), (2, 4, 4, 5, 13, 8)]
>>> ts, ns = list(zip(*L))
>>> ts
(('A', 'Ant'), ('A', 'Apple'), ('B', 'Ball'), ('B', 'Beach'), ('C', 'Cat'), ('D', 'Ball'))
>>> ns
(2, 4, 4, 5, 13, 8)
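Note that sorting changes the order compared with the expected output in the question: here ('A', 'Ant') comes before ('A', 'Apple'), so the sums start with 2, 4 rather than 4, 2. If first-occurrence order matters, one possible alternative (not part of the answer above) is to accumulate the sums in a dict keyed by the tuple. This sketch assumes Python 3.7+, where dicts preserve insertion order; the names `totals`, `new_ts`, and `new_arr` are illustrative.

```python
ts = [("A", "Apple"), ("A", "Ant"), ("A", "Apple"), ("B", "Ball"),
      ("B", "Beach"), ("C", "Cat"), ("C", "Cat"), ("D", "Ball")]
arr = [1, 2, 3, 4, 5, 6, 7, 8]

# Accumulate the sum for each distinct tuple; a tuple is inserted into
# the dict the first time it is seen, which fixes its position.
totals = {}
for t, n in zip(ts, arr):
    totals[t] = totals.get(t, 0) + n

new_ts = list(totals)            # distinct tuples in first-seen order
new_arr = list(totals.values())  # matching sums

print(new_ts)
print(new_arr)  # [4, 2, 4, 5, 13, 8]
```

This version also avoids sorting, so it makes a single pass over the data.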