Question

d = {'Name1': ['Male', '18'],
     'Name2': ['Male', '16'], 
     'Name3': ['Male', '18'],
     'Name4': ['Female', '18'], 
     'Name5': ['Female', '18']}

I'm trying to find a way to isolate the duplicate keys into lists (if there are any). Like:

['Name1', 'Name3']
['Name4', 'Name5']

How can I do this? Thanks.

Answer 1

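An imperative solution would be to iterate over the dictionary and add the items to another dictionary that uses the (gender, age) tuple as its key, for example: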

# using a defaultdict, which automatically adds an empty list for a missing key when it is first accessed
from collections import defaultdict
by_data = defaultdict(list) 
for name, data in d.items():
    # turn the data into something immutable, so it can be used as a dictionary key
    data_tuple = tuple(data)
    by_data[data_tuple].append(name)

The result will be:

{('Female', '18'): ['Name4', 'Name5'],
 ('Male', '16'): ['Name2'],
 ('Male', '18'): ['Name1', 'Name3']}

If you are only interested in the duplicates, you can filter out the entries that have only one value.
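That filtering step could look something like the sketch below. It rebuilds `by_data` from the question's dictionary so it runs on its own; the name `duplicates` is just an illustrative choice, and the order of the groups assumes Python 3.7+, where dictionaries keep insertion order.

```python
from collections import defaultdict

d = {'Name1': ['Male', '18'],
     'Name2': ['Male', '16'],
     'Name3': ['Male', '18'],
     'Name4': ['Female', '18'],
     'Name5': ['Female', '18']}

# group the names by their (gender, age) tuple, as above
by_data = defaultdict(list)
for name, data in d.items():
    by_data[tuple(data)].append(name)

# keep only the groups that contain more than one name
duplicates = [names for names in by_data.values() if len(names) > 1]

for names in duplicates:
    print(names)
```

If nothing in the dictionary is duplicated, `duplicates` is simply an empty list.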

Answer 2

I guess you mean duplicated values rather than keys, in which case you can do this with pandas:

import pandas as pd
df = pd.DataFrame(d).T #load the data into a dataframe, and transpose it
df.index[df.duplicated(keep = False)]

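df.duplicated(keep = False) gives you a Series of True / False values, where the value is True whenever that row has a duplicate and False otherwise. We use it to index the row names, i.e. 'Name1', 'Name2', etc.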
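To make the True / False mask more concrete, here is a rough plain-Python sketch of the same idea. It is not how pandas implements `duplicated`; `counts`, `mask` and `duplicated_names` are names made up for this illustration.

```python
from collections import Counter

d = {'Name1': ['Male', '18'],
     'Name2': ['Male', '16'],
     'Name3': ['Male', '18'],
     'Name4': ['Female', '18'],
     'Name5': ['Female', '18']}

# count how often each (gender, age) combination occurs
counts = Counter(tuple(value) for value in d.values())

# True for every name whose value occurs more than once (like keep=False)
mask = {name: counts[tuple(value)] > 1 for name, value in d.items()}

# use the mask to select the names
duplicated_names = [name for name, is_dup in mask.items() if is_dup]
print(duplicated_names)
```

Note that, like the pandas version, this yields one flat collection of all names that have a duplicate; it does not split them into one list per distinct value.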

Answer 3

Try this:

d = {'Name1': ['Male', '18'],
 'Name2': ['Male', '16'], 
 'Name3': ['Male', '18'],
 'Name4': ['Female', '18'], 
 'Name5': ['Female', '18']}

ages = {}  # maps each age to the list of names with that age

# loop over all the items in the dictionary
for key, value in d.items():
    age = value[1]

    # if the ages dictionary has no entry for this age yet,
    # create a list to hold the names with that age
    if age not in ages:
        ages[age] = []

    ages[age].append(key)  # add the name to the list for its age

# loop over all the name lists in the ages dictionary
for names in ages.values():
    if len(names) > 1:  # more than one name shares this age
        print(names)
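Note that the code above groups on the age alone, so for the sample data it puts 'Name1', 'Name3', 'Name4' and 'Name5' into a single list. A possible variation, sketched below, makes the grouping key a parameter so the same loop can group on the whole value instead. `group_keys` and `key_func` are hypothetical names introduced only for this example.

```python
d = {'Name1': ['Male', '18'],
     'Name2': ['Male', '16'],
     'Name3': ['Male', '18'],
     'Name4': ['Female', '18'],
     'Name5': ['Female', '18']}

def group_keys(data, key_func):
    """Return the lists of names that share the same key_func(value)."""
    groups = {}
    for name, value in data.items():
        # setdefault creates the list the first time a key is seen
        groups.setdefault(key_func(value), []).append(name)
    # keep only the groups with more than one name
    return [names for names in groups.values() if len(names) > 1]

by_age = group_keys(d, lambda value: value[1])  # age only
by_gender_and_age = group_keys(d, tuple)        # the whole value

print(by_age)
print(by_gender_and_age)
```

Grouping on `tuple(value)` gives the two lists asked for in the question, while grouping on the age gives one larger list.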

Isolating dictionary keys that have multiple duplicate values
