Question

I have a list of dictionaries, and each dictionary itself contains a list:

    [{'author': 'Stephen King', 'books': ['The stand', 'The Outsider']},
     {'author': 'Ernest Hemingway', 'books': ['A Moveable Feast', 'The sun Also Rises']},
     {'author': 'Stephen King', 'books': ['The stand', 'The Outsider']}]

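I have tried most of the usual ways to remove duplicates from a list of dictionaries, but so far none of them work, because of the lists inside the dictionaries.

The goal is to remove duplicates from a list of dictionaries where each dictionary itself contains a list.

The expected output for the data above is: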

    [{'author': 'Stephen King', 'books': ['The stand', 'The Outsider']},
     {'author': 'Ernest Hemingway', 'books': ['A Moveable Feast', 'The sun Also Rises']}]

Answer 1

Here is one way to do it.

Example:

from pprint import pprint

data = [{'author': 'Stephen King', 'books': ['The stand', 'The Outsider']},
        {'author': 'Ernest Hemingway', 'books': ['A Moveable Feast', 'The sun Also Rises']},
        {'author': 'Stephen King', 'books': ['The stand', 'The Outsider']}]

checkVal = set()
result = []
for item in data:
    if item["author"] not in checkVal:   # Check whether this author was already seen
        result.append(item)              # Keep the first entry for each author
        checkVal.add(item["author"])     # Remember the author
pprint(result)

Output:

[{'author': 'Stephen King', 'books': ['The stand', 'The Outsider']},
 {'author': 'Ernest Hemingway',
  'books': ['A Moveable Feast', 'The sun Also Rises']}]
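Note that the author-only check above treats two entries as duplicates whenever the author matches, even if their book lists differ. A small sketch with made-up data (the second entry is hypothetical and not taken from the question) illustrates the effect:

```python
# Hypothetical data: the second Stephen King entry lists a different book.
data = [
    {'author': 'Stephen King', 'books': ['The stand', 'The Outsider']},
    {'author': 'Stephen King', 'books': ['It']},
]

seen_authors = set()
result = []
for item in data:
    if item['author'] not in seen_authors:
        result.append(item)
        seen_authors.add(item['author'])

# Only the first entry survives; the entry with ['It'] is dropped
# even though it is not an exact duplicate of the first one.
print(len(result))
```

Whether that behaviour is wanted depends on what counts as a duplicate in your data.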

Edit based on the comments: check both author and books

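from pprint import pprint

checkVal = set()
result = []
for item in data:
    c = tuple(item["books"] + [item["author"]])   # Hashable key built from books and author
    if c not in checkVal:    # Check whether this author/books combination was already seen
        result.append(item)  # Keep the first occurrence
        checkVal.add(c)      # Remember the combination
pprint(result)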

Answer 2

dicts = [{'author': 'Stephen King', 'books': ['The stand', 'The Outsider']}, {'author': 'Ernest Hemingway', 'books': ['A Moveable Feast', 'The sun Also Rises']},{'author': 'Stephen King', 'books': ['The stand', 'The Outsider']}]

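def remove(dicts):
    # Note: this modifies the list passed in and keeps the last copy of each duplicate.
    for i in range(len(dicts)):
        if dicts[i] in dicts[i+1:]:
            dicts.remove(dicts[i])   # Drop the earlier copy, then start over
            return remove(dicts)
    # Return only after every element has been checked (also covers an empty list)
    return dicts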

print (remove(dicts))

Output:

[{'author': 'Ernest Hemingway', 'books': ['A Moveable Feast', 'The sun Also Rises']}, {'author': 'Stephen King', 'books': ['The stand', 'The Outsider']}]
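The recursive version above changes the input list and keeps the last copy of each duplicate, so the order differs from the expected output in the question. Below is a minimal sketch of an alternative that keeps the first occurrence and leaves the input untouched. It assumes plain equality comparison is acceptable and the list is small, since the `in` test on a list is linear and the whole loop is therefore quadratic. The name `remove_duplicates` is only illustrative.

```python
def remove_duplicates(dicts):
    """Return a new list that keeps the first occurrence of each dictionary."""
    result = []
    for d in dicts:
        # Dictionaries compare by value, so no hashing is needed here.
        if d not in result:
            result.append(d)
    return result


dicts = [{'author': 'Stephen King', 'books': ['The stand', 'The Outsider']},
         {'author': 'Ernest Hemingway', 'books': ['A Moveable Feast', 'The sun Also Rises']},
         {'author': 'Stephen King', 'books': ['The stand', 'The Outsider']}]

unique = remove_duplicates(dicts)
print(unique)
```

For larger inputs, a set of hashable keys is likely to be faster than repeated list membership tests.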

Answer 3

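You should write some code that converts dictionaries in your format into hashable objects. After that, ordinary de-duplication code (using a set) will work: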

data = [{'author': 'Stephen King', 'books': ['The stand', 'The Outsider']},
        {'author': 'Ernest Hemingway', 'books': ['A Moveable Feast', 'The sun Also Rises']},
        {'author': 'Stephen King', 'books': ['The stand', 'The Outsider']}]

seen = set()
result = []
for dct in data:
    t = (dct['author'], tuple(dct['books'])) # transform into something hashable
    if t not in seen:
        seen.add(t)
        result.append(dct)

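This code assumes that your dictionaries have only the keys 'author' and 'books', and no others. If you want to be more general and support other keys and values as well, you can extend the logic. Here is an alternative computation of t that supports arbitrary keys (as long as they are all comparable with each other) and any number of lists among the values: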

t = tuple((k, tuple(v) if isinstance(v, list) else v) for k, v in sorted(dct.items()))
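The same idea can be wrapped in a reusable helper. The sketch below is one possible way to do it, not part of the original answer: the names `freeze` and `dedupe` are made up for illustration, and it assumes that values are only strings, numbers, lists, or nested dictionaries with mutually comparable keys. Because lists are turned into tuples, a list and a tuple with the same items would be treated as equal.

```python
def freeze(value):
    """Recursively convert lists and dicts into hashable tuples (illustrative helper)."""
    if isinstance(value, dict):
        # Sort by key so that key order does not affect the result.
        return tuple((k, freeze(v)) for k, v in sorted(value.items()))
    if isinstance(value, list):
        return tuple(freeze(v) for v in value)
    return value


def dedupe(dicts):
    """Return a new list with the first occurrence of each dictionary kept."""
    seen = set()
    result = []
    for dct in dicts:
        key = freeze(dct)
        if key not in seen:
            seen.add(key)
            result.append(dct)
    return result


data = [{'author': 'Stephen King', 'books': ['The stand', 'The Outsider']},
        {'author': 'Ernest Hemingway', 'books': ['A Moveable Feast', 'The sun Also Rises']},
        {'books': ['The stand', 'The Outsider'], 'author': 'Stephen King'}]

unique = dedupe(data)
print(unique)
```

The third entry lists its keys in a different order, but it is still recognised as a duplicate of the first because `freeze` sorts the items before building the key.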

How do I remove duplicates from a list of dictionaries that contain lists?
