Question

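I wrote a loop that compares the sublists of a list of lists. When it finds two similar sublists, it adds up one of their elements (the number at index 2) and deletes the second sublist. Is there a way to do this more correctly?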

Input list:

someList = [['abc', 'def', 10, 'ghi'], ['abc', 'def', 50, 'ghi'], ['jkl', 'mno', 20, 'pqr']]

Code:

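a = 0
for i in range(len(someList)):
    for k in range(len(someList)):
        if someList[i] != someList[k]:
            # Sublists are "similar" when elements 0, 1 and 3 match.
            if someList[i][0] == someList[k][0]:
                if someList[i][1] == someList[k][1]:
                    # The last valid index is 3; index 4 would raise IndexError.
                    if someList[i][3] == someList[k][3]:
                        someList[i][2] = someList[i][2] + someList[k][2]
                        # Overwrite the last element so the pair is not matched again.
                        someList[k][3] = 'lalala'
                        a = k
del someList[a]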

Desired output list:

someList = [['abc', 'def', 60, 'ghi'], ['jkl', 'mno', 20, 'pqr']]

This code works, but it is very badly written. It also only works when the list contains exactly two similar sublists.

Answer 1

I would do this with a temporary lookup map:

someList = [['abc', 'def', 10, 'ghi'], ['abc', 'def', 50, 'ghi'], ['jkl', 'mno', 20, 'pqr']]

lookup_map = {}
for e1, e2, e3, e4 in someList:
    key = (e1, e2, e4)
    if key in lookup_map:
        lookup_map[key][2] += e3
    else:
        lookup_map[key] = [e1, e2, e3, e4]

print(list(lookup_map.values()))
# On Python 3.7+ (dicts keep insertion order):
# [['abc', 'def', 60, 'ghi'], ['jkl', 'mno', 20, 'pqr']]
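
As a quick check against the concern in the question (more than two similar sublists), here is a minimal sketch of the same lookup-map idea wrapped in a function. The function name merge_similar and the extra fourth row are hypothetical, added only for illustration.

```python
def merge_similar(rows):
    """Merge rows that share elements 0, 1 and 3, summing element 2."""
    lookup_map = {}
    for e1, e2, e3, e4 in rows:
        key = (e1, e2, e4)
        if key in lookup_map:
            lookup_map[key][2] += e3
        else:
            # Store a fresh list so the input rows are not mutated.
            lookup_map[key] = [e1, e2, e3, e4]
    return list(lookup_map.values())


rows = [
    ['abc', 'def', 10, 'ghi'],
    ['abc', 'def', 50, 'ghi'],
    ['jkl', 'mno', 20, 'pqr'],
    ['abc', 'def', 5, 'ghi'],   # a third similar row (made-up sample data)
]
merged = merge_similar(rows)
print(merged)
# On Python 3.7+: [['abc', 'def', 65, 'ghi'], ['jkl', 'mno', 20, 'pqr']]
```

Because every row is visited once and merged by key, the number of similar sublists does not matter.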

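This runs in O(N) time. If you need to preserve order, use collections.OrderedDict as lookup_map (on Python 3.7+ a plain dict already keeps insertion order). You can also make it more memory-efficient by storing only the keys and accumulating the third element as the value, like this: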

import collections

someList = [['abc', 'def', 10, 'ghi'], ['abc', 'def', 50, 'ghi'], ['jkl', 'mno', 20, 'pqr']]

lookup_map = collections.defaultdict(int)
for e1, e2, e3, e4 in someList:
    lookup_map[e1, e2, e4] += e3

# The keys are (e1, e2, e4) tuples, so unpack them back into lists:
result = [[e1, e2, v, e4] for (e1, e2, e4), v in lookup_map.items()]
# On Python 3.7+: [['abc', 'def', 60, 'ghi'], ['jkl', 'mno', 20, 'pqr']]

Or, if you don't want to use collections.defaultdict (you should; it will be faster for larger datasets):

someList = [['abc', 'def', 10, 'ghi'], ['abc', 'def', 50, 'ghi'], ['jkl', 'mno', 20, 'pqr']]

lookup_map = {}
for e1, e2, e3, e4 in someList:
    key = (e1, e2, e4)
    lookup_map[key] = lookup_map.get(key, 0) + e3

# The keys are (e1, e2, e4) tuples, so unpack them back into lists:
result = [[e1, e2, v, e4] for (e1, e2, e4), v in lookup_map.items()]
# On Python 3.7+: [['abc', 'def', 60, 'ghi'], ['jkl', 'mno', 20, 'pqr']]
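
For the order-preserving variant mentioned earlier, a minimal sketch with collections.OrderedDict might look like this. It assumes the same someList input and the same key layout as the other snippets. Updating the value of an existing key does not move that key, so first-seen order is kept.

```python
from collections import OrderedDict

someList = [['abc', 'def', 10, 'ghi'], ['abc', 'def', 50, 'ghi'], ['jkl', 'mno', 20, 'pqr']]

# OrderedDict remembers the order in which keys were first inserted.
lookup_map = OrderedDict()
for e1, e2, e3, e4 in someList:
    key = (e1, e2, e4)
    lookup_map[key] = lookup_map.get(key, 0) + e3

# Unpack the (e1, e2, e4) keys and summed values back into lists.
result = [[e1, e2, v, e4] for (e1, e2, e4), v in lookup_map.items()]
print(result)
# [['abc', 'def', 60, 'ghi'], ['jkl', 'mno', 20, 'pqr']]
```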

Loop for merging similar lists within a list
