Isolating dictionary keys that have multiple duplicate values

Time: 2017-10-26 09:08:18

Tags: python dictionary duplicates

d = {'Name1': ['Male', '18'],
     'Name2': ['Male', '16'], 
     'Name3': ['Male', '18'],
     'Name4': ['Female', '18'], 
     'Name5': ['Female', '18']}

I'm trying to find a way to collect the keys that share duplicate values into lists (if there are any). Something like:

['Name1', 'Name3']
['Name4', 'Name5']

How can I do this? Thanks.

3 answers:

Answer 0: (score: 2)

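An imperative solution is to iterate over the dictionary and add each item to another dictionary that uses a (gender, age) tuple as its key, for example: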

# using a defaultdict, which automatically adds an empty list for a missing key when it is first accessed
from collections import defaultdict
by_data = defaultdict(list) 
for name, data in d.items():
    # turn the data into something immutable, so it can be used as a dictionary key
    data_tuple = tuple(data)
    by_data[data_tuple].append(name)

The result will be:

{('Female', '18'): ['Name4', 'Name5'],
 ('Male', '16'): ['Name2'],
 ('Male', '18'): ['Name1', 'Name3']}

If you are only interested in the duplicates, you can filter out the entries that have only one value.
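A minimal sketch of that filtering step, assuming the `d` from the question and the `by_data` mapping built above (the name `duplicates` is only illustrative):

```python
from collections import defaultdict

d = {'Name1': ['Male', '18'],
     'Name2': ['Male', '16'],
     'Name3': ['Male', '18'],
     'Name4': ['Female', '18'],
     'Name5': ['Female', '18']}

# group the names by their (gender, age) value, as in the answer above
by_data = defaultdict(list)
for name, data in d.items():
    by_data[tuple(data)].append(name)

# keep only the groups in which more than one name shares the same value
duplicates = [names for names in by_data.values() if len(names) > 1]
print(duplicates)
```

On Python 3.7 or later, where dictionaries keep insertion order, this should print the two lists the question asks for, `['Name1', 'Name3']` and `['Name4', 'Name5']`, wrapped in an outer list.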

Answer 1: (score: 0)

I guess you mean duplicate values rather than keys, in which case you can do this with pandas:

import pandas as pd
df = pd.DataFrame(d).T #load the data into a dataframe, and transpose it
df.index[df.duplicated(keep = False)] 

df.duplicated(keep = False) gives you a Series of True/False values: True wherever the row has a duplicate, and False otherwise. We use it to index the row names, i.e. 'Name1', 'Name2', and so on.
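The index returned above is flat: it lists every name that has a duplicate, but does not say which names belong together. One way to recover the groups is to group the duplicated rows by their values. This is only a sketch, assuming the two list positions mean gender and age; the column names `gender` and `age` are illustrative and not from the question:

```python
import pandas as pd

d = {'Name1': ['Male', '18'],
     'Name2': ['Male', '16'],
     'Name3': ['Male', '18'],
     'Name4': ['Female', '18'],
     'Name5': ['Female', '18']}

# build one row per name; the column names are assumed labels for illustration
df = pd.DataFrame.from_dict(d, orient="index", columns=["gender", "age"])

# keep only the rows whose values occur more than once
dupes = df[df.duplicated(keep=False)]

# group identical rows and collect the row names of each group
groups = [list(names) for names in dupes.groupby(["gender", "age"]).groups.values()]
print(groups)
```

Each inner list holds the names that share one (gender, age) combination. The order of the groups follows the grouping keys rather than the order of the original dictionary.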

Answer 2: (score: 0)

Try this:

d = {'Name1': ['Male', '18'],
 'Name2': ['Male', '16'], 
 'Name3': ['Male', '18'],
 'Name4': ['Female', '18'], 
 'Name5': ['Female', '18']}

groups = {}  # maps each (gender, age) value to the names that share it

# loop over all the items in the dictionary
for key, value in d.items():
    # lists are unhashable, so convert the value to a tuple before using it as a key;
    # keying on the age alone would lump Name1, Name3, Name4 and Name5 together
    data = tuple(value)

    # if there is no entry for this value yet, create a list to hold its names
    if data not in groups:
        groups[data] = []

    groups[data].append(key)  # collect names with identical values together

# loop over all the groups
for names in groups.values():
    if len(names) > 1:  # more than one name shares this value
        print(names)  # print the group