Python - How to compare multiple dicts and remove duplicate values?

Time: 2017-03-01 13:00:07

Tags: python

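As you can see here, I have a "main" dictionary where each value is itself a list of dictionaries. Now I want to compare the "name" values of the main dictionary's entries (there can be more than 2) with each other, e.g. "DE, Stuttgart" with "DE, Dresden" and X, so that only the unique "name" values are left.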

I know the x for x in y if x['key'] != None construct, but as far as I can tell, I can only use it to filter a single dictionary.

Input:

"DE, Stuttgart": [
    {
        "url": "http://twitter.com/search?q=%23ISIS", 
        "query": "%23ISIS", 
        "tweet_volume": 21646, 
        "name": "#ISIS", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%22Hans+Rosling%22", 
        "query": "%22Hans+Rosling%22", 
        "tweet_volume": 44855, 
        "name": "Hans Rosling", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%22Betsy+DeVos%22", 
        "query": "%22Betsy+DeVos%22", 
        "tweet_volume": 664741, 
        "name": "Betsy DeVos", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=Nioh", 
        "query": "Nioh", 
        "tweet_volume": 24160, 
        "name": "Nioh", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%23FCBWOB", 
        "query": "%23FCBWOB", 
        "tweet_volume": 14216, 
        "name": "#FCBWOB", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%23sid2017", 
        "query": "%23sid2017", 
        "tweet_volume": 28277, 
        "name": "#sid2017", 
        "promoted_content": null
    }
], 
"DE, Dresden": [
    {
        "url": "http://twitter.com/search?q=%22Hans+Rosling%22", 
        "query": "%22Hans+Rosling%22", 
        "tweet_volume": 44855, 
        "name": "Hans Rosling", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%22Betsy+DeVos%22", 
        "query": "%22Betsy+DeVos%22", 
        "tweet_volume": 664741, 
        "name": "Betsy DeVos", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=Nioh", 
        "query": "Nioh", 
        "tweet_volume": 24160, 
        "name": "Nioh", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%23FCBWOB", 
        "query": "%23FCBWOB", 
        "tweet_volume": 14216, 
        "name": "#FCBWOB", 
        "promoted_content": null
    }, 
    {
        "url": "http://twitter.com/search?q=%23sid2017", 
        "query": "%23sid2017", 
        "tweet_volume": 28277, 
        "name": "#sid2017", 
        "promoted_content": null
    }
], 

Output:

"DE, Stuttgart": [
    {
        "url": "http://twitter.com/search?q=%23ISIS",
        "query": "%23ISIS",
        "tweet_volume": 21646,
        "name": "#ISIS",
        "promoted_content": null
    }
],
"DE, Dresden": [
],

3 answers:

Answer 0 (score: 3)

You can collect the names into a Counter and then rebuild the original dictionary, keeping only those sub-dictionaries whose name is unique:


main = {
    "DE, Stuttgart": [
        {
            "url": "http://twitter.com/search?q=%23ISIS",
            "query": "%23ISIS",
            "tweet_volume": 21646,
            "name": "#ISIS",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%22Hans+Rosling%22",
            "query": "%22Hans+Rosling%22",
            "tweet_volume": 44855,
            "name": "Hans Rosling",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%22Betsy+DeVos%22",
            "query": "%22Betsy+DeVos%22",
            "tweet_volume": 664741,
            "name": "Betsy DeVos",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=Nioh",
            "query": "Nioh",
            "tweet_volume": 24160,
            "name": "Nioh",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%23FCBWOB",
            "query": "%23FCBWOB",
            "tweet_volume": 14216,
            "name": "#FCBWOB",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%23sid2017",
            "query": "%23sid2017",
            "tweet_volume": 28277,
            "name": "#sid2017",
            "promoted_content": None
        }
    ],
    "DE, Dresden": [
        {
            "url": "http://twitter.com/search?q=%22Hans+Rosling%22",
            "query": "%22Hans+Rosling%22",
            "tweet_volume": 44855,
            "name": "Hans Rosling",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%22Betsy+DeVos%22",
            "query": "%22Betsy+DeVos%22",
            "tweet_volume": 664741,
            "name": "Betsy DeVos",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=Nioh",
            "query": "Nioh",
            "tweet_volume": 24160,
            "name": "Nioh",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%23FCBWOB",
            "query": "%23FCBWOB",
            "tweet_volume": 14216,
            "name": "#FCBWOB",
            "promoted_content": None
        },
        {
            "url": "http://twitter.com/search?q=%23sid2017",
            "query": "%23sid2017",
            "tweet_volume": 28277,
            "name": "#sid2017",
            "promoted_content": None
        }
    ]
}
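from collections import Counter
import pprint

# Count how often each name occurs across all locations.
names = Counter(d['name'] for l in main.values() for d in l)
# Keep only the sub-dicts whose name occurs exactly once overall.
result = {k: [d for d in v if names[d['name']] == 1] for k, v in main.items()}

pprint.pprint(result)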
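
To see the same counting idea on a smaller input, here is a minimal sketch with three locations. The sample data (`trends`, the location keys "A", "B", "C" and the names) and the helper name `keep_unique_names` are made up for illustration and do not come from the question.

```python
from collections import Counter


def keep_unique_names(main):
    """Return a new dict keeping only sub-dicts whose 'name' occurs once overall."""
    # Tally every name across all locations in a single pass.
    counts = Counter(d["name"] for dicts in main.values() for d in dicts)
    return {
        location: [d for d in dicts if counts[d["name"]] == 1]
        for location, dicts in main.items()
    }


# Hypothetical sample data with three locations.
trends = {
    "A": [{"name": "x"}, {"name": "y"}],
    "B": [{"name": "y"}, {"name": "z"}],
    "C": [{"name": "y"}, {"name": "w"}],
}

result = keep_unique_names(trends)
print(result)
# {'A': [{'name': 'x'}], 'B': [{'name': 'z'}], 'C': [{'name': 'w'}]}
```

Note that a name appearing twice inside the same location would also be dropped, because the count is taken over all sub-dicts rather than per location.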

Answer 1 (score: 1)

This outputs the desired dict for any number of locations. Note that @niemmi's solution is more efficient:

main_dict = {"DE, Stuttgart": [
    {
        "url": "http://twitter.com/search?q=%23ISIS", 
        "query": "%23ISIS", 
        "tweet_volume": 21646, 
        "name": "#ISIS", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%22Hans+Rosling%22", 
        "query": "%22Hans+Rosling%22", 
        "tweet_volume": 44855, 
        "name": "Hans Rosling", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%22Betsy+DeVos%22", 
        "query": "%22Betsy+DeVos%22", 
        "tweet_volume": 664741, 
        "name": "Betsy DeVos", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=Nioh", 
        "query": "Nioh", 
        "tweet_volume": 24160, 
        "name": "Nioh", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%23FCBWOB", 
        "query": "%23FCBWOB", 
        "tweet_volume": 14216, 
        "name": "#FCBWOB", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%23sid2017", 
        "query": "%23sid2017", 
        "tweet_volume": 28277, 
        "name": "#sid2017", 
        "promoted_content": None
    }
], 
"DE, Dresden": [
    {
        "url": "http://twitter.com/search?q=%22Hans+Rosling%22", 
        "query": "%22Hans+Rosling%22", 
        "tweet_volume": 44855, 
        "name": "Hans Rosling", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%22Betsy+DeVos%22", 
        "query": "%22Betsy+DeVos%22", 
        "tweet_volume": 664741, 
        "name": "Betsy DeVos", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=Nioh", 
        "query": "Nioh", 
        "tweet_volume": 24160, 
        "name": "Nioh", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%23FCBWOB", 
        "query": "%23FCBWOB", 
        "tweet_volume": 14216, 
        "name": "#FCBWOB", 
        "promoted_content": None
    }, 
    {
        "url": "http://twitter.com/search?q=%23sid2017", 
        "query": "%23sid2017", 
        "tweet_volume": 28277, 
        "name": "#sid2017", 
        "promoted_content": None
    }
]
}

def get_names(main_dict, location):
    return {small_dict["name"] for small_dict in main_dict[location]}

def get_names_from_other_locations(main_dict, location):
    other_locations = [other_loc for other_loc in main_dict if other_loc != location]
    return {small_dict["name"] for other_location in other_locations for small_dict in main_dict[other_location]}

def get_uniq_names(main_dict, location):
    return get_names(main_dict, location) - get_names_from_other_locations(main_dict, location)

def get_dict(main_dict, location, name):
    for small_dict in main_dict[location]:
        if small_dict["name"] == name:
            return small_dict
    return None

# print() with parentheses works on both Python 2 and Python 3
print({location: [get_dict(main_dict, location, uniq_name) for uniq_name in get_uniq_names(main_dict, location)] for location in main_dict})
# {'DE, Stuttgart': [{'url': 'http://twitter.com/search?q=%23ISIS', 'query': '%23ISIS', 'tweet_volume': 21646, 'name': '#ISIS', 'promoted_content': None}], 'DE, Dresden': []}
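
Because the unique names above are held in a set, the order of the sub-dicts within each location is not guaranteed to match the input. If order matters, one possible variant is sketched below; the helper name `unique_entries` and the sample data are hypothetical, and unlike the code above it keeps repeated names within one location instead of collapsing them.

```python
def unique_entries(main_dict, location):
    """Entries of `location` whose name appears in no other location, in input order."""
    # Collect every name used by any *other* location.
    other_names = {
        small_dict["name"]
        for other_location, dicts in main_dict.items()
        if other_location != location
        for small_dict in dicts
    }
    # Filter the original list so the input order is preserved.
    return [d for d in main_dict[location] if d["name"] not in other_names]


# Hypothetical sample data.
sample = {
    "A": [{"name": "x"}, {"name": "y"}, {"name": "v"}],
    "B": [{"name": "y"}, {"name": "z"}],
}

filtered = {location: unique_entries(sample, location) for location in sample}
print(filtered)
# {'A': [{'name': 'x'}, {'name': 'v'}], 'B': [{'name': 'z'}]}
```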

Answer 2 (score: 0)

Let's say d1 and d2 are your two dictionaries. You can get the list of keys in d1 that are not in d2:

[k for k in d1 if k not in d2]
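
As a small illustration of this key comparison, the sketch below uses two made-up dictionaries (the contents of `d1` and `d2` are assumptions, not data from the question). It also shows the set-style difference on key views, which assumes Python 3.

```python
# Hypothetical example dictionaries.
d1 = {"a": 1, "b": 2, "c": 3}
d2 = {"b": 20, "d": 40}

# List comprehension: keeps the insertion order of d1.
only_in_d1 = [k for k in d1 if k not in d2]
print(only_in_d1)  # ['a', 'c']

# Python 3 key views support set operations; the result is an unordered set.
only_in_d1_set = d1.keys() - d2.keys()
print(sorted(only_in_d1_set))  # ['a', 'c']
```

This compares dictionary keys only; it does not look at the "name" values nested inside the lists.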