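I wrote a loop that compares the sublists in a list of lists. If it finds two similar sublists, it sums their numeric elements into the first one and deletes the second. Is there a better way to do this?
Input list: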
someList = [['abc', 'def', 10, 'ghi'], ['abc', 'def', 50, 'ghi'], ['jkl', 'mno', 20, 'pqr']]
Code:
a = 0
for i in range(len(someList)):
    for k in range(len(someList)):
        if someList[i] != someList[k]:
            if someList[i][0] == someList[k][0]:
                if someList[i][1] == someList[k][1]:
                    # index 3 is the last element of each four-element sublist
                    if someList[i][3] == someList[k][3]:
                        someList[i][2] = someList[i][2] + someList[k][2]
                        someList[k][3] = 'lalala'
                        a = k
del someList[a]
The desired output list is:
someList = [['abc', 'def', 60, 'ghi'], ['jkl', 'mno', 20, 'pqr']]
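This code works, but it is very badly written. Also, it only works if the list contains exactly two similar sublists.
Answer 0 (score: 0)
I would do this with a temporary lookup map: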
someList = [['abc', 'def', 10, 'ghi'], ['abc', 'def', 50, 'ghi'], ['jkl', 'mno', 20, 'pqr']]
lookup_map = {}
for e1, e2, e3, e4 in someList:
    key = (e1, e2, e4)
    if key in lookup_map:
        lookup_map[key][2] += e3
    else:
        lookup_map[key] = [e1, e2, e3, e4]
print(list(lookup_map.values()))
# On Python 3.7+ (insertion-ordered dicts):
# [['abc', 'def', 60, 'ghi'], ['jkl', 'mno', 20, 'pqr']]
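This runs in O(N) time. If you need to preserve order, use collections.OrderedDict as the lookup_map (on Python 3.7 and later, a plain dict already preserves insertion order). Also, you can make it more memory efficient by storing only the keys and accumulating the third element as the value, like this: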
import collections
someList = [['abc', 'def', 10, 'ghi'], ['abc', 'def', 50, 'ghi'], ['jkl', 'mno', 20, 'pqr']]
lookup_map = collections.defaultdict(int)
for e1, e2, e3, e4 in someList:
    lookup_map[e1, e2, e4] += e3
# but now we need to unpack it back into a list:
result = [[e1, e2, v, e4] for (e1, e2, e4), v in lookup_map.items()]
# On Python 3.7+ (insertion-ordered dicts):
# [['abc', 'def', 60, 'ghi'], ['jkl', 'mno', 20, 'pqr']]
Or, if you don't want to use collections.defaultdict (you should, since it will be faster for larger datasets):
someList = [['abc', 'def', 10, 'ghi'], ['abc', 'def', 50, 'ghi'], ['jkl', 'mno', 20, 'pqr']]
lookup_map = {}
for e1, e2, e3, e4 in someList:
    key = (e1, e2, e4)
    lookup_map[key] = lookup_map.get(key, 0) + e3
# but now we need to unpack it back into a list:
result = [[e1, e2, v, e4] for (e1, e2, e4), v in lookup_map.items()]
# On Python 3.7+ (insertion-ordered dicts):
# [['abc', 'def', 60, 'ghi'], ['jkl', 'mno', 20, 'pqr']]
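For the order-preserving variant mentioned earlier, a minimal sketch might look like the following. The function name merge_similar and the extra fourth input row are my own additions for illustration, not part of the question. The extra row is there to show that the approach is not limited to two similar sublists.

```python
from collections import OrderedDict


def merge_similar(rows):
    """Sum the third element of rows that share elements 0, 1 and 3."""
    totals = OrderedDict()  # key -> running sum, kept in first-seen order
    for e1, e2, amount, e4 in rows:
        key = (e1, e2, e4)
        totals[key] = totals.get(key, 0) + amount
    # Rebuild the four-element rows from each key and its total.
    return [[e1, e2, total, e4] for (e1, e2, e4), total in totals.items()]


some_list = [
    ['abc', 'def', 10, 'ghi'],
    ['abc', 'def', 50, 'ghi'],
    ['jkl', 'mno', 20, 'pqr'],
    ['abc', 'def', 5, 'ghi'],  # hypothetical third similar sublist
]
result = merge_similar(some_list)
print(result)
# [['abc', 'def', 65, 'ghi'], ['jkl', 'mno', 20, 'pqr']]
```

Unlike the loop in the question, this builds a new list and leaves the input untouched, so nothing is deleted when there are no similar sublists.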