d = {'Name1': ['Male', '18'],
     'Name2': ['Male', '16'],
     'Name3': ['Male', '18'],
     'Name4': ['Female', '18'],
     'Name5': ['Female', '18']}
I am trying to find a way to collect the keys that have duplicate values into lists, if there are any. For example:
['Name1', 'Name3']
['Name4', 'Name5']
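How can I do this? Thanks.
Answer 0 (score: 2)
An imperative solution is to iterate over the dictionary and add each item to another dictionary that uses the (gender, age) tuple as its key, for example: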
# using a defaultdict, which automatically adds an empty list for a missing key when it is first accessed
from collections import defaultdict

by_data = defaultdict(list)
for name, data in d.items():
    # turn the data into something immutable, so it can be used as a dictionary key
    data_tuple = tuple(data)
    by_data[data_tuple].append(name)
The result will be:
{('Female', '18'): ['Name4', 'Name5'],
 ('Male', '16'): ['Name2'],
 ('Male', '18'): ['Name1', 'Name3']}
If you are only interested in the duplicates, you can filter out the entries whose list contains only one name.
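The filtering step is not spelled out in the answer, so here is a minimal sketch of one way to do it. The name `duplicates` is introduced here for illustration only, and the list order relies on dictionaries preserving insertion order (Python 3.7+).

```python
from collections import defaultdict

d = {'Name1': ['Male', '18'],
     'Name2': ['Male', '16'],
     'Name3': ['Male', '18'],
     'Name4': ['Female', '18'],
     'Name5': ['Female', '18']}

# group the names by their (gender, age) tuple, as in the answer
by_data = defaultdict(list)
for name, data in d.items():
    by_data[tuple(data)].append(name)

# keep only the groups that contain more than one name
# ("duplicates" is a name chosen for this sketch)
duplicates = [names for names in by_data.values() if len(names) > 1]

for names in duplicates:
    print(names)
# ['Name1', 'Name3']
# ['Name4', 'Name5']
```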
Answer 1 (score: 0)
I guess you mean duplicate values rather than duplicate keys, in which case you can do this with pandas:
import pandas as pd

df = pd.DataFrame(d).T  # load the data into a DataFrame and transpose it
df.index[df.duplicated(keep=False)]
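df.duplicated(keep=False)
gives you a Series of True/False values: True wherever the row has a duplicate, and False otherwise. We use it to index the row names ('Name1', 'Name2', etc.), which keeps only the names of the rows that are duplicated.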
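To make the True/False mask concrete without pandas, the same flags can be computed with `collections.Counter`. This is only a plain-Python sketch of the idea, not how pandas implements `duplicated`; the names `counts`, `mask`, and `dup_names` are made up for the illustration.

```python
from collections import Counter

d = {'Name1': ['Male', '18'],
     'Name2': ['Male', '16'],
     'Name3': ['Male', '18'],
     'Name4': ['Female', '18'],
     'Name5': ['Female', '18']}

# count how many times each (gender, age) value occurs
counts = Counter(tuple(value) for value in d.values())

# True for every name whose value occurs more than once, False otherwise
mask = {name: counts[tuple(value)] > 1 for name, value in d.items()}

# selecting with the mask keeps only the names that have a duplicate
dup_names = [name for name, flag in mask.items() if flag]
print(dup_names)  # ['Name1', 'Name3', 'Name4', 'Name5']
```

Note that this returns one flat collection of names. It tells you which names have a duplicate, but not which names belong together.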
Answer 2 (score: 0)
Try this:
d = {'Name1': ['Male', '18'],
     'Name2': ['Male', '16'],
     'Name3': ['Male', '18'],
     'Name4': ['Female', '18'],
     'Name5': ['Female', '18']}

ages = {}  # dictionary that maps each age to the names that have it

# loop over all the items in the dictionary
for key in d.keys():
    age = d[key][1]
    # if the ages dictionary does not yet have an entry
    # for this age, create a list to hold the names with that age
    if age not in ages:
        ages[age] = []
    ages[age].append(key)  # finally, append the name to the list for its age

# loop over all the lists in the ages dictionary
for value in ages.values():
    if len(value) > 1:  # if more than one name shares this age
        print(value)  # print the list
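Note that the code above groups by age only, so with the sample data it prints a single list containing Name1, Name3, Name4 and Name5. If the goal is the output shown in the question, one possible variant is to key on the whole value instead. This is a sketch of that idea, with `groups` and `result` as names chosen here:

```python
d = {'Name1': ['Male', '18'],
     'Name2': ['Male', '16'],
     'Name3': ['Male', '18'],
     'Name4': ['Female', '18'],
     'Name5': ['Female', '18']}

groups = {}
for key, value in d.items():
    # lists are not hashable, so convert the value to a tuple to use it as a key;
    # setdefault creates the empty list the first time a value is seen
    groups.setdefault(tuple(value), []).append(key)

# keep only the values shared by more than one name
result = [names for names in groups.values() if len(names) > 1]

for names in result:
    print(names)
# ['Name1', 'Name3']
# ['Name4', 'Name5']
```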