Remove duplicates from a dictionary by value

Asked: 2016-08-08 23:04:29

Tags: python dictionary list-comprehension

I have the following dictionary:

potential_duplicates = {
  432L: (u'one two three', u'one two three'), 
  433L: (u'one two three', u'one two three'), 
  434L: (u'whole foods', u'whole foods'), 
  435L: (u'whole foods', u'whole foods'),
  437L: (u'this is a dupe', u'this is a dupe'),
  438L: (u'this is a dupe', u'this is a dupe'), 
  439L: (u'this is a dupe', u'this is a dupe')
}

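I am removing duplicate entries for items in a database, so for each set of duplicates I want to keep at least one item and put the rest into a list of duplicates to be deleted.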

Can I use this structure, or should I use a list instead?

1 answer:

Answer 0: (score: 0)

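You can do this with two nested dictionary comprehensions. The inner one collapses the duplicates by swapping keys and values, so each distinct value ends up mapped to a single key; the outer one swaps them back to rebuild the original form. Note that this relies on the values being hashable (the tuples here are), and that the key that survives for each value is whichever one is iterated last.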

>>> {k:v for v,k in {v:k for k,v in potential_duplicates.items()}.items()}
{433L: (u'one two three', u'one two three'), 435L: (u'whole foods', u'whole foods'), 439L: (u'this is a dupe', u'this is a dupe')}
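
To make the mechanism easier to follow, the two comprehensions can be unrolled into separate steps. This is only a sketch, assuming Python 3.7 or later (no `L` suffix or `u` prefix, and dictionaries preserve insertion order, so the last key seen for each value is the one that survives); the intermediate name `by_value` is introduced here for illustration.

```python
# Python 3 rendering of the dictionary from the question.
potential_duplicates = {
    432: ('one two three', 'one two three'),
    433: ('one two three', 'one two three'),
    434: ('whole foods', 'whole foods'),
    435: ('whole foods', 'whole foods'),
    437: ('this is a dupe', 'this is a dupe'),
    438: ('this is a dupe', 'this is a dupe'),
    439: ('this is a dupe', 'this is a dupe'),
}

# Step 1: swap keys and values. A dict cannot hold the same key twice,
# so each distinct value keeps only the last id assigned to it.
by_value = {v: k for k, v in potential_duplicates.items()}

# Step 2: swap back to recover the original id -> value shape.
kept = {k: v for v, k in by_value.items()}

print(sorted(kept))
```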

To get the list of removed keys, use a list comprehension that compares the two dictionaries:

>>> kept = {k:v for v,k in {v:k for k,v in potential_duplicates.items()}.items()}
>>> removed = [k for k in potential_duplicates.keys() if k not in kept]
>>> removed
[432L, 434L, 437L, 438L]
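
If it matters which row survives (for example, always the lowest id), an explicit loop gives that control and builds both results in one pass. The following is a hedged alternative, not the method from the answer above; it assumes Python 3, and the helper name `split_duplicates` is hypothetical.

```python
def split_duplicates(items):
    """Return (kept, removed), keeping the smallest key for each distinct value."""
    seen = set()      # values already encountered
    kept = {}         # first (lowest) key for each distinct value
    removed = []      # keys whose value was already seen
    for key in sorted(items):
        value = items[key]
        if value in seen:
            removed.append(key)
        else:
            seen.add(value)
            kept[key] = value
    return kept, removed


# Python 3 rendering of the dictionary from the question.
potential_duplicates = {
    432: ('one two three', 'one two three'),
    433: ('one two three', 'one two three'),
    434: ('whole foods', 'whole foods'),
    435: ('whole foods', 'whole foods'),
    437: ('this is a dupe', 'this is a dupe'),
    438: ('this is a dupe', 'this is a dupe'),
    439: ('this is a dupe', 'this is a dupe'),
}

kept, removed = split_duplicates(potential_duplicates)
print(sorted(kept))
print(removed)
```

Here `removed` holds the ids that would be deleted from the database, and `kept` holds one id per distinct value.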