Remove duplicate values from dictionaries

Time: 2020-07-15 10:05:16

Tags: python dictionary

I have a list of dictionaries. After extracting only the 'rate' and 'genre' data I need, I want to remove the duplicate entries.

a=[{'movie': 'abc', 'rate': '9', 'origin': 'AU', 'genre': 'horror'},
   {'movie': 'xyz', 'rate': '7', 'origin': 'NY', 'genre': 'romance'},
   {'movie': 'jkl', 'rate': '9', 'origin': 'HK', 'genre': 'horror'},
   {'movie': 'qwe', 'rate': '6', 'origin': 'HK', 'genre': 'comedy'},
   {'movie': 'vbn', 'rate': '9', 'origin': 'BKK', 'genre': 'romance'}]

needed_data=[]
for test in a:
    x={}
    word=['rate','genre']
    for key,value in test.items():
        for words in word:
            if key == words:
                x[key] = value

    needed_data.append(x)

results = {}
filters=[]
for yy in needed_data:
    for key,value in yy.items():
        if value not in results.values():
            results[key] = value
    filters.append(results)
print(filters)

The output of the code above is

[{'rate': '9', 'genre': 'romance'}, 
{'rate': '9', 'genre': 'romance'}, 
{'rate': '9', 'genre': 'romance'}, 
{'rate': '9', 'genre': 'romance'}, 
{'rate': '9', 'genre': 'romance'}]

The output I want is

[{'rate': '9', 'genre': 'horror'}, 
{'rate': '7', 'genre': 'romance'},  
{'rate': '6', 'genre': 'comedy'}, 
{'rate': '9', 'genre': 'romance'}]

3 answers:

Answer 0: (score: 1)

I suggest using pandas for the data processing

import pandas as pd
df = pd.DataFrame(a)
df_dd= df[["genre", "rate"]].drop_duplicates()
new_a = df_dd.to_dict(orient="records")
print(new_a)

Output (for input whose first record has 'rate': '9.', which is why both horror rows remain)

[{'genre': 'horror', 'rate': '9.'}, 
 {'genre': 'romance', 'rate': '7'}, 
 {'genre': 'horror', 'rate': '9'}, 
 {'genre': 'comedy', 'rate': '6'}, 
 {'genre': 'romance', 'rate': '9'}]

Answer 1: (score: 0)

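Your data contains both the string '9.' and the string '9'. Is that intentional? If not, key each record on the rate converted to a number plus the genre, so that both spellings collapse into one entry: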

# Records with the same normalized rate and genre share a key; the last one wins
z = {f"{float(x['rate']):.2f}-{x['genre']}": x for x in needed_data}
print(list(z.values()))

Output

[{'rate': '9', 'genre': 'horror'},
 {'rate': '7', 'genre': 'romance'},
 {'rate': '6', 'genre': 'comedy'},
 {'rate': '9', 'genre': 'romance'}]
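
The same idea can also be written with a tuple key instead of a formatted string. This is only a sketch: the helper name `dedupe_by_rate_genre` is made up for illustration, and it assumes every `rate` string can be parsed by `float()`. Unlike the comprehension above, it keeps the first record seen for each key rather than the last.

```python
def dedupe_by_rate_genre(records):
    """Return one {'rate', 'genre'} dict per distinct (numeric rate, genre) pair."""
    unique = {}
    for rec in records:
        # float() makes '9.' and '9' compare equal; assumes rate is numeric text
        key = (float(rec['rate']), rec['genre'])
        # setdefault stores a value only if the key is new, so the first record wins
        unique.setdefault(key, {'rate': rec['rate'], 'genre': rec['genre']})
    # dicts preserve insertion order, so first-seen order is kept
    return list(unique.values())


sample = [
    {'rate': '9.', 'genre': 'horror'},
    {'rate': '7', 'genre': 'romance'},
    {'rate': '9', 'genre': 'horror'},
    {'rate': '6', 'genre': 'comedy'},
    {'rate': '9', 'genre': 'romance'},
]
deduped = dedupe_by_rate_genre(sample)
print(deduped)
```

Because the first record is kept, the horror entry retains its original spelling '9.' here. If a uniform spelling matters, the rate could be rewritten when the entry is stored.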

Answer 2: (score: 0)

Here is a simple way to do it:

a=[{'movie': 'abc', 'rate': '9.', 'origin': 'AU', 'genre': 'horror'},
   {'movie': 'xyz', 'rate': '7', 'origin': 'NY', 'genre': 'romance'},
   {'movie': 'jkl', 'rate': '9', 'origin': 'HK', 'genre': 'horror'},
   {'movie': 'qwe', 'rate': '6', 'origin': 'HK', 'genre': 'comedy'},
   {'movie': 'vbn', 'rate': '9', 'origin': 'BKK', 'genre': 'romance'}]
c = []
for b in a:
    c.append({'rate':b['rate'],'genre':b['genre'] })
print(c)

So the output will be:

[{'rate': '9.', 'genre': 'horror'}, {'rate': '7', 'genre': 'romance'}, {'rate': '9', 'genre': 'horror'}, {'rate': '6', 'genre': 'comedy'}, {'rate': '9', 'genre': 'romance'}]
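
Note that the loop above only picks out the two keys; duplicate rate/genre pairs are still in the list. One possible way to drop them is to remember the pairs already emitted in a set. The sketch below uses the question's data, where every rate is spelled consistently; the names `seen` and `unique` are illustrative.

```python
a = [{'movie': 'abc', 'rate': '9', 'origin': 'AU', 'genre': 'horror'},
     {'movie': 'xyz', 'rate': '7', 'origin': 'NY', 'genre': 'romance'},
     {'movie': 'jkl', 'rate': '9', 'origin': 'HK', 'genre': 'horror'},
     {'movie': 'qwe', 'rate': '6', 'origin': 'HK', 'genre': 'comedy'},
     {'movie': 'vbn', 'rate': '9', 'origin': 'BKK', 'genre': 'romance'}]

seen = set()   # (rate, genre) pairs that have already been added
unique = []
for b in a:
    pair = (b['rate'], b['genre'])
    # Exact string comparison: '9.' and '9' would count as different rates
    if pair not in seen:
        seen.add(pair)
        unique.append({'rate': b['rate'], 'genre': b['genre']})
print(unique)
```

With this input the list has four entries, in the order asked for in the question: 9/horror, 7/romance, 6/comedy, 9/romance.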