Removing matching items from two lists of dicts

Time: 2017-07-18 01:30:34

Tags: python dictionary

I need to take two lists of dictionaries and filter out the "garbage" items whose names are unrecognizable:

data = [
    {'annotation_id': 22, 'record_id': 5, 'name': 'Joe Young'},
    {'annotation_id': 13, 'record_id': 7, 'name': '----'},
    {'annotation_id': 12, 'record_id': 9, 'name': 'Greg Band'},
]

garbage = [
    {'annotation_id': 13, 'record_id': 7, 'name': '----'}
]

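So in this case I need to remove annotation_id 13 from data.

I tried iterating over the list and deleting items as I went, but I understand that doesn't work well in Python. I also tried a list comprehension, but that failed too. What am I doing wrong? My code is below: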

data = [[item for item in data if item['name'] != g['name'] for g in garbage]

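The code above creates many duplicate copies of the dicts.

3 Answers:

Answer 0: (score: 3)

A simple and elegant way to remove specific entries from a list of dicts, where garbage is the list of dict entries to remove from data: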

for g in garbage:
    if g in data:
        data.remove(g)

Input data:

data = [
    {'annotation_id': 22, 'record_id': 5, 'name': 'Joe Young'},
    {'annotation_id': 13, 'record_id': 7, 'name': '----'},
    {'annotation_id': 12, 'record_id': 9, 'name': 'Greg Band'},
]

garbage = [
    {'annotation_id': 13, 'record_id': 7, 'name': '----'}
]

Result:

data = [
    {'record_id': 5, 'annotation_id': 22, 'name': 'Joe Young'}, 
    {'record_id': 9, 'annotation_id': 12, 'name': 'Greg Band'}
]
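
One caveat worth noting about the loop above: list.remove deletes only the first element equal to g, and it mutates data in place. Below is a minimal sketch of a non-mutating variant that also drops repeated matches. The helper name remove_garbage and the duplicated row are assumptions made for illustration, not part of the answer.

```python
def remove_garbage(data, garbage):
    """Return a new list without any dict that equals an entry in garbage."""
    # Dicts compare by content, so `item not in garbage` is an equality
    # check against every garbage entry: O(len(data) * len(garbage)).
    return [item for item in data if item not in garbage]


data = [
    {'annotation_id': 22, 'record_id': 5, 'name': 'Joe Young'},
    {'annotation_id': 13, 'record_id': 7, 'name': '----'},
    {'annotation_id': 13, 'record_id': 7, 'name': '----'},  # hypothetical duplicate row
    {'annotation_id': 12, 'record_id': 9, 'name': 'Greg Band'},
]
garbage = [{'annotation_id': 13, 'record_id': 7, 'name': '----'}]

cleaned = remove_garbage(data, garbage)
print([d['annotation_id'] for d in cleaned])  # [22, 12]
print(len(data))  # 4, the original list is left untouched
```

This matches whole dicts, so an entry is removed only if every key and value agrees with a garbage entry.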

Answer 1: (score: 1)

You can create a set that holds the garbage names, then filter data against that set (assuming name is the criterion you need to filter by):

garbage_names = {d['name'] for d in garbage}

[item for item in data if item['name'] not in garbage_names]
#[{'annotation_id': 22, 'name': 'Joe Young', 'record_id': 5},
# {'annotation_id': 12, 'name': 'Greg Band', 'record_id': 9}]

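As pointed out in the comments, you can also stay close to your original approach with [item for item in data if all(item['name'] != g['name'] for g in garbage)], but it is somewhat less efficient: the double loop has O(M * N) time complexity, whereas using a set reduces it to O(M + N). Here are some naive timings: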

%timeit [item for item in data if all(item['name'] != g['name'] for g in garbage)]
# 1000000 loops, best of 3: 1.68 µs per loop

%%timeit
garbage_names = {d['name'] for d in garbage}
[item for item in data if item['name'] not in garbage_names]
# 1000000 loops, best of 3: 608 ns per loop
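
The timings above use a three-item list, where the gap is small. A rough sketch of how the same comparison could be run outside IPython on larger input follows. The synthetic data, the sizes, and the function names filter_nested and filter_with_set are assumptions for illustration, and the measured numbers will vary by machine.

```python
import timeit

# Synthetic input (made up for this sketch): 2000 records, every 4th is garbage.
data = [{'annotation_id': i, 'record_id': i, 'name': 'name-%d' % i}
        for i in range(2000)]
garbage = [dict(d) for d in data[::4]]


def filter_nested(data, garbage):
    # Compares each item against garbage entries until one matches: O(M * N).
    return [item for item in data
            if all(item['name'] != g['name'] for g in garbage)]


def filter_with_set(data, garbage):
    # One pass to build the set, one pass to filter: O(M + N) on average.
    garbage_names = {d['name'] for d in garbage}
    return [item for item in data if item['name'] not in garbage_names]


# Both approaches should keep exactly the same items.
assert filter_nested(data, garbage) == filter_with_set(data, garbage)

for fn in (filter_nested, filter_with_set):
    seconds = timeit.timeit(lambda: fn(data, garbage), number=3)
    print('%s: %.4f s for 3 runs' % (fn.__name__, seconds))
```

The set version needs the filter key to be hashable, which holds for string names like the ones here.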

Answer 2: (score: 1)

How about a simple filter?

# list() is needed on Python 3, where filter returns a lazy iterator
list(filter(lambda x: x not in garbage, data))

[{'annotation_id': 22, 'name': 'Joe Young', 'record_id': 5},
 {'annotation_id': 12, 'name': 'Greg Band', 'record_id': 9}]
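
Note that on Python 3, filter returns a lazy iterator rather than a list, so the list shown above only appears once the iterator is materialized. A small sketch of that behavior, reusing the question's data (the variable names kept and result are illustrative):

```python
data = [
    {'annotation_id': 22, 'record_id': 5, 'name': 'Joe Young'},
    {'annotation_id': 13, 'record_id': 7, 'name': '----'},
    {'annotation_id': 12, 'record_id': 9, 'name': 'Greg Band'},
]
garbage = [{'annotation_id': 13, 'record_id': 7, 'name': '----'}]

kept = filter(lambda x: x not in garbage, data)
print(type(kept).__name__)  # 'filter' on Python 3: nothing has been evaluated yet

result = list(kept)          # materialize the iterator into a list
print([d['name'] for d in result])  # ['Joe Young', 'Greg Band']

# The iterator is now exhausted, so consuming it again yields nothing.
print(list(kept))  # []
```

Like the in-place loop in the first answer, this compares whole dicts, so it costs O(M * N) comparisons.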