How do I remove duplicates from a list of dictionaries that contain lists?

Time: 2019-04-30 08:16:35

Tags: arrays python-3.x dictionary duplicates

I have a list of dictionaries, and each dictionary itself contains a list:

    [{'author': 'Stephen King', 'books': ['The stand', 'The Outsider']},
     {'author': 'Ernest Hemingway', 'books': ['A Moveable Feast', 'The sun Also Rises']},
     {'author': 'Stephen King', 'books': ['The stand', 'The Outsider']}]

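I have tried most of the usual ways of removing duplicates from a list of dictionaries, but so far none of them work, because of the lists inside the dictionaries.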
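For illustration, here is a minimal sketch of why a common recipe for flat dictionaries breaks on this data. The helper name `naive_dedupe` is hypothetical and only stands in for the usual "tuple of items in a set" approach; it assumes nothing beyond the standard library.

```python
data = [
    {'author': 'Stephen King', 'books': ['The stand', 'The Outsider']},
    {'author': 'Ernest Hemingway', 'books': ['A Moveable Feast', 'The sun Also Rises']},
    {'author': 'Stephen King', 'books': ['The stand', 'The Outsider']},
]

def naive_dedupe(dicts):
    # Hypothetical helper: the usual recipe for dictionaries whose values
    # are all hashable. Each dict becomes a tuple of (key, value) pairs,
    # and the set drops the repeated tuples.
    return [dict(items) for items in {tuple(d.items()) for d in dicts}]

try:
    naive_dedupe(data)
    error_message = None
except TypeError as exc:
    # The 'books' value is a list, lists are not hashable, and so the
    # tuple that contains it cannot be added to a set.
    error_message = str(exc)

print(error_message)
```

The call fails with a `TypeError` because a tuple is only hashable when everything inside it is hashable.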

The goal is to remove the duplicates from a list of dictionaries in which each dictionary itself contains a list.

The expected output for the data above is:

    [{'author': 'Stephen King', 'books': ['The stand', 'The Outsider']},
     {'author': 'Ernest Hemingway', 'books': ['A Moveable Feast', 'The sun Also Rises']}]

3 answers:

Answer 0 (score: 0)

Here is one approach.

For example:

data = [{'author': 'Stephen King', 'books': ['The stand', 'The Outsider']}, {'author': 'Ernest Hemingway', 'books': ['A Moveable Feast', 'The sun Also Rises']},{'author': 'Stephen King', 'books': ['The stand', 'The Outsider']}]

checkVal = set()
result = []
for item in data:
    if item["author"] not in checkVal:   # Check whether this author was already seen
        result.append(item)              # Keep the first entry for each author
        checkVal.add(item["author"])     # Remember the author
print(result)

Output:

[{'author': 'Stephen King', 'books': ['The stand', 'The Outsider']},
 {'author': 'Ernest Hemingway',
  'books': ['A Moveable Feast', 'The sun Also Rises']}]

Edit based on the comments: check both author and books

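from pprint import pprint

checkVal = set()
result = []
for item in data:
    c = tuple(item["books"] + [item["author"]])  # Hashable key built from books and author
    if c not in checkVal:       # Check whether this author/books combination was already seen
        result.append(item)     # Keep the first occurrence
        checkVal.add(c)         # Remember the combination
pprint(result)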

Answer 1 (score: 0)

dicts = [{'author': 'Stephen King', 'books': ['The stand', 'The Outsider']}, {'author': 'Ernest Hemingway', 'books': ['A Moveable Feast', 'The sun Also Rises']},{'author': 'Stephen King', 'books': ['The stand', 'The Outsider']}]

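def remove(dicts):
    # Note: this mutates the list that is passed in.
    for i in range(len(dicts)):
        if dicts[i] in dicts[i+1:]:
            # list.remove() deletes the first element equal to dicts[i],
            # so the later copy is the one that survives.
            dicts.remove(dicts[i])
            return remove(dicts)
    # Reached only when no element has a later duplicate.
    return dicts

print(remove(dicts))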

Output:

[{'author': 'Ernest Hemingway', 'books': ['A Moveable Feast', 'The sun Also Rises']}, {'author': 'Stephen King', 'books': ['The stand', 'The Outsider']}]
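The same equality-based idea can also be sketched without recursion and without modifying the input. This is only an illustrative variant, not the code from the answer: the function name `remove_duplicates` is made up here, and unlike the output above it keeps the first occurrence of each dictionary, which matches the order asked for in the question.

```python
def remove_duplicates(dicts):
    """Return a new list that keeps the first occurrence of each dictionary."""
    result = []
    for d in dicts:
        # List membership compares with ==, so the dictionaries
        # do not have to be hashable for this to work.
        if d not in result:
            result.append(d)
    return result

dicts = [
    {'author': 'Stephen King', 'books': ['The stand', 'The Outsider']},
    {'author': 'Ernest Hemingway', 'books': ['A Moveable Feast', 'The sun Also Rises']},
    {'author': 'Stephen King', 'books': ['The stand', 'The Outsider']},
]

unique = remove_duplicates(dicts)
print(unique)
```

Each `in` test scans the result list, so the running time grows quadratically with the number of dictionaries; that is usually fine for short lists.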

Answer 2 (score: 0)

You should write some code that converts a dictionary in your format into a hashable object. Then the usual deduplication code (using a set) will work:

data = [{'author': 'Stephen King', 'books': ['The stand', 'The Outsider']},
        {'author': 'Ernest Hemingway', 'books': ['A Moveable Feast', 'The sun Also Rises']},
        {'author': 'Stephen King', 'books': ['The stand', 'The Outsider']}]

seen = set()
result = []
for dct in data:
    t = (dct['author'], tuple(dct['books'])) # transform into something hashable
    if t not in seen:
        seen.add(t)
        result.append(dct)

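This code assumes that your dictionaries have only the keys 'author' and 'books' and nothing else. If you want something more general that also supports other keys and values, you can extend the logic. Here is an alternative computation of t that supports arbitrary keys (as long as they are all mutually comparable) and any number of lists among the values: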

t = tuple((k, tuple(v) if isinstance(v, list) else v) for k, v in sorted(dct.items()))
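Put together, the generalized key might be used roughly as follows. This is a sketch under a few assumptions: the helper names `make_key` and `dedupe` are made up for illustration, every value is either hashable or a flat list of hashable items, and the last record only repeats the first one with its keys in a different order to show why the items are sorted.

```python
def make_key(dct):
    """Build a hashable key from a dict whose values are hashable or flat lists."""
    return tuple(
        (k, tuple(v) if isinstance(v, list) else v)
        for k, v in sorted(dct.items())  # sort so that key order does not matter
    )

def dedupe(dicts):
    """Return a new list that keeps the first occurrence of each dictionary."""
    seen = set()
    result = []
    for dct in dicts:
        key = make_key(dct)
        if key not in seen:
            seen.add(key)
            result.append(dct)
    return result

data = [
    {'author': 'Stephen King', 'books': ['The stand', 'The Outsider']},
    {'author': 'Ernest Hemingway', 'books': ['A Moveable Feast', 'The sun Also Rises']},
    # Same record as the first one, with the keys inserted in a different order.
    {'books': ['The stand', 'The Outsider'], 'author': 'Stephen King'},
]

unique = dedupe(data)
print(unique)
```

Because keys within one dictionary are unique, sorting the items never has to compare two values, so the lists themselves do not need to be comparable. Nested structures such as lists of lists would still need extra handling.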