Comparing nested lists in Python and counting duplicates

Asked: 2012-09-25 17:55:07

Tags: python list count nested duplicates

I have two nested lists of strings (list_a, list_b), as follows:

list_a = [
('shop1', 'stand1', 'shelf1', 'fruit1'),
('shop1', 'stand1', 'shelf2', 'fruit2'),
('shop1', 'stand1', 'shelf3', 'fruit3'),
('shop1', 'stand2', 'shelf1', 'fruit1'),
('shop1', 'stand2', 'shelf2', 'fruit2'),
('shop1', 'stand2', 'shelf3', 'fruit3'),
('shop2', 'stand3', 'shelf1', 'fruit1'),
('shop2', 'stand3', 'shelf2', 'fruit2'),
('shop2', 'stand3', 'shelf3', 'fruit3')
]
list_b = [
('shop1', 'stand1', 'shelf1', 'fruit1'),
('shop1', 'stand1', 'shelf2', 'fruit2'),
('shop1', 'stand1', 'shelf2', 'fruit2'),
('shop1', 'stand1', 'shelf3', 'fruit3'),
('shop1', 'stand1', 'shelf3', 'fruit3'),
('shop1', 'stand1', 'shelf3', 'fruit3'),
('shop1', 'stand2', 'shelf1', 'fruit1'),
('shop1', 'stand2', 'shelf1', 'fruit1'),
('shop1', 'stand2', 'shelf1', 'fruit1'),
('shop1', 'stand2', 'shelf2', 'fruit2'),
('shop1', 'stand2', 'shelf2', 'fruit2'),
('shop1', 'stand2', 'shelf2', 'fruit2'),
('shop1', 'stand2', 'shelf3', 'fruit3'),
('shop2', 'stand3', 'shelf1', 'fruit1'),
('shop2', 'stand3', 'shelf1', 'fruit1'),
('shop2', 'stand3', 'shelf2', 'fruit2'),
('shop2', 'stand3', 'shelf3', 'fruit3'),
('shop2', 'stand3', 'shelf3', 'fruit3'),
('shop2', 'stand3', 'shelf3', 'fruit3')
]

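I want to find the rows of list_a that also appear in list_b, count the "duplicate" rows, and merge list_a with one extra column (the number of occurrences) into a new list, like this: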

result_list = [
('shop1', 'stand1', 'shelf1', 'fruit1', 1),
('shop1', 'stand1', 'shelf2', 'fruit2', 2),
('shop1', 'stand1', 'shelf3', 'fruit3', 3),
('shop1', 'stand2', 'shelf1', 'fruit1', 3),
('shop1', 'stand2', 'shelf2', 'fruit2', 3),
('shop1', 'stand2', 'shelf3', 'fruit3', 1),
('shop2', 'stand3', 'shelf1', 'fruit1', 2),
('shop2', 'stand3', 'shelf2', 'fruit2', 1),
('shop2', 'stand3', 'shelf3', 'fruit3', 3)
]

Is there a fast and efficient way to do this?

3 answers:

Answer 0 (score: 1)

Use Counter():

    >>> from collections import Counter
    >>> count=Counter(list_b)
    >>> [list(x)+[count[x]] for x in list_a]

    [['shop1', 'stand1', 'shelf1', 'fruit1', 1], 
    ['shop1', 'stand1', 'shelf2', 'fruit2', 2],
    ['shop1', 'stand1', 'shelf3', 'fruit3', 3],
    ['shop1', 'stand2', 'shelf1', 'fruit1', 3],
    ['shop1', 'stand2', 'shelf2', 'fruit2', 3],
    ['shop1', 'stand2', 'shelf3', 'fruit3', 1],
    ['shop2', 'stand3', 'shelf1', 'fruit1', 2], 
    ['shop2', 'stand3', 'shelf2', 'fruit2', 1], 
    ['shop2', 'stand3', 'shelf3', 'fruit3', 3]]
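
The session above returns lists, while the result_list in the question holds tuples. A minimal sketch of the same Counter idea that keeps each row a tuple might look like this. The data is a shortened sample chosen for illustration, not the full lists from the question:

```python
from collections import Counter

# Shortened sample data for illustration (not the full lists from the question).
list_a = [
    ('shop1', 'stand1', 'shelf1', 'fruit1'),
    ('shop1', 'stand1', 'shelf2', 'fruit2'),
    ('shop1', 'stand1', 'shelf3', 'fruit3'),
]
list_b = [
    ('shop1', 'stand1', 'shelf1', 'fruit1'),
    ('shop1', 'stand1', 'shelf2', 'fruit2'),
    ('shop1', 'stand1', 'shelf2', 'fruit2'),
]

# Count every row of list_b once; tuples are hashable, so they work as keys.
count = Counter(list_b)

# Append the count as a one-element tuple so each result row stays a tuple.
# A Counter returns 0 for rows it has never seen, so no KeyError is raised.
result_list = [row + (count[row],) for row in list_a]
print(result_list)
```

The last row of list_a does not occur in the sample list_b, so it gets a count of 0 rather than being dropped.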

Answer 1 (score: 1)

dict_a = {row: 0 for row in list_a}
for row in list_b:
    if row in dict_a:
        dict_a[row] += 1

result = [row + (dict_a[row],) for row in list_a]

On Python 2.6, use dict((row, 0) for row in list_a) instead of the dict comprehension.
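
For reference, here is a self-contained sketch of the same dictionary approach. The function name count_rows and the two-column sample rows are made up for illustration, and dict.fromkeys is swapped in as one more way to build the zero-initialised dictionary without a comprehension:

```python
def count_rows(list_a, list_b):
    """Hypothetical helper: append to each row of list_a its count in list_b."""
    # dict.fromkeys builds {row: 0, ...} without needing a dict comprehension.
    counts = dict.fromkeys(list_a, 0)
    for row in list_b:
        if row in counts:  # rows of list_b that are not in list_a are skipped
            counts[row] += 1
    # Preserve the order of list_a and add the count as a last column.
    return [row + (counts[row],) for row in list_a]


# Made-up sample rows, shorter than the ones in the question.
list_a = [('shop1', 'stand1'), ('shop2', 'stand3')]
list_b = [('shop1', 'stand1'), ('shop1', 'stand1'), ('shop9', 'stand9')]
print(count_rows(list_a, list_b))
```

Unlike a Counter over all of list_b, this version only stores keys that occur in list_a, which may matter if list_b contains many unrelated rows.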

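Answer 2 (score: 0)

Those are not nested lists, they are tuples. That is actually what saves you here: tuples are hashable, so they can be used directly as dictionary keys. See "Most Efficient way to calculate Frequency of values in a Python list?", whose answers should work right away. To get the duplicates, take the keys() of both dictionaries and compute their difference.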
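
One possible reading of that suggestion, as a rough sketch: build a frequency dictionary for each list, then compare the key sets. The variable names and the sample rows (including the 'shop9' row) are assumptions made for illustration:

```python
from collections import Counter

# Made-up sample data; the 'shop9' row exists only to show a key difference.
list_a = [
    ('shop1', 'stand1', 'shelf1', 'fruit1'),
    ('shop1', 'stand1', 'shelf2', 'fruit2'),
]
list_b = [
    ('shop1', 'stand1', 'shelf1', 'fruit1'),
    ('shop1', 'stand1', 'shelf1', 'fruit1'),
    ('shop9', 'stand9', 'shelf9', 'fruit9'),
]

# One frequency dictionary per list, keyed by the row tuples.
freq_a = Counter(list_a)
freq_b = Counter(list_b)

# Set difference of the keys: rows present in one list but not in the other.
only_in_b = set(freq_b) - set(freq_a)
only_in_a = set(freq_a) - set(freq_b)

# Rows that occur more than once in list_b.
repeated_in_b = [row for row, n in freq_b.items() if n > 1]

print(only_in_b, only_in_a, repeated_in_b)
```

The key difference alone only tells which rows are missing from one side; the per-row counts still come from the frequency dictionaries.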