I have the following dictionary:
potential_duplicates = {
432L: (u'one two three', u'one two three'),
433L: (u'one two three', u'one two three'),
434L: (u'whole foods', u'whole foods'),
435L: (u'whole foods', u'whole foods'),
437L: (u'this is a dupe', u'this is a dupe'),
438L: (u'this is a dupe', u'this is a dupe'),
439L: (u'this is a dupe', u'this is a dupe')
}
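I am removing duplicate entries for items in a database, so I want to keep at least one entry from each group of duplicates and put the others into a list of duplicates to delete.
Can I use this structure, or should I use a list instead?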
Answer 0 (score: 0)
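You can do this with two nested dict comprehensions. The inner one collapses the duplicates by swapping keys and values, and the outer one swaps them back to rebuild the original shape. Which key survives for each value depends on the dictionary's iteration order, because a later key overwrites an earlier one during the inversion. The values also have to be hashable, which tuples are.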
>>> {k:v for v,k in {v:k for k,v in potential_duplicates.items()}.items()}
{433L: (u'one two three', u'one two three'), 435L: (u'whole foods', u'whole foods'), 439L: (u'this is a dupe', u'this is a dupe')}
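The same inversion, unrolled into two explicit steps, may make it clearer why only one key per value survives. This is only a sketch and assumes Python 3.7 or later (plain integers instead of the `432L` long literals, no `u''` prefixes, and dictionaries that preserve insertion order); `inverted` is an illustrative name that does not appear in the original.

```python
# Python 3 rewrite of the sample data (assumption: ids become plain ints).
potential_duplicates = {
    432: ('one two three', 'one two three'),
    433: ('one two three', 'one two three'),
    434: ('whole foods', 'whole foods'),
    435: ('whole foods', 'whole foods'),
    437: ('this is a dupe', 'this is a dupe'),
    438: ('this is a dupe', 'this is a dupe'),
    439: ('this is a dupe', 'this is a dupe'),
}

# Step 1: invert. Each value becomes a key, so a later id silently
# overwrites an earlier id that carries the same value.
inverted = {}
for key, value in potential_duplicates.items():
    inverted[value] = key

# Step 2: invert back to the original {id: value} shape.
kept = {key: value for value, key in inverted.items()}

print(sorted(kept))
```

With insertion-ordered dictionaries, the last id seen for each value is the one that ends up in `kept`.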
To get the list of removed keys, use a list comprehension that compares the two dictionaries:
>>> kept = {k:v for v,k in {v:k for k,v in potential_duplicates.items()}.items()}
>>> removed = [k for k in potential_duplicates.keys() if k not in kept]
>>> removed
[432L, 434L, 437L, 438L]
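If it matters which entry is kept, one alternative is to group the keys by value and choose the survivor explicitly. The sketch below is not the approach from the answer above: it assumes Python 3 and assumes that keeping the smallest id is the desired rule, and `split_duplicates` is a hypothetical helper name introduced only for illustration.

```python
from collections import defaultdict


def split_duplicates(entries):
    """Return (kept, removed), keeping the smallest key per distinct value.

    Hypothetical helper; the "smallest key wins" rule is an assumption.
    """
    # Group every key under the value it maps to.
    groups = defaultdict(list)
    for key, value in entries.items():
        groups[value].append(key)

    kept, removed = {}, []
    for value, keys in groups.items():
        keys.sort()
        kept[keys[0]] = value      # first (smallest) key survives
        removed.extend(keys[1:])   # the rest are duplicates to delete
    return kept, sorted(removed)


# Python 3 rewrite of the sample data (assumption: ids become plain ints).
entries = {
    432: ('one two three', 'one two three'),
    433: ('one two three', 'one two three'),
    434: ('whole foods', 'whole foods'),
    435: ('whole foods', 'whole foods'),
    437: ('this is a dupe', 'this is a dupe'),
    438: ('this is a dupe', 'this is a dupe'),
    439: ('this is a dupe', 'this is a dupe'),
}

kept_ids, removed_ids = split_duplicates(entries)
print(sorted(kept_ids))  # [432, 434, 437]
print(removed_ids)       # [433, 435, 438, 439]
```

Grouping first makes the choice of survivor independent of dictionary iteration order, and the `removed_ids` list can be handed directly to whatever performs the database deletion.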