import pandas as pd

# Sample observations: some fruit/color pairs occur twice with different responses
rows = [
    {'fruit': 'apple', 'color': 'red', 'response': 'right'},
    {'fruit': 'apple', 'color': 'red', 'response': 'wrong'},
    {'fruit': 'pineapple', 'color': 'green', 'response': 'True'},
    {'fruit': 'pineapple', 'color': 'green', 'response': 'wrong'},
    {'fruit': 'orange', 'color': 'orange', 'response': 'wrong'},
]
df = pd.DataFrame(rows)
I want to remove the duplicated fruit/color combinations where response = 'wrong'.
Answer 0 (score: 0)
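You can use drop_duplicates. By default it keeps the first row of each fruit/color combination, which works here because the 'wrong' row comes second in every duplicated pair.
For example: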
import pandas as pd

# Same sample data as in the question
rows = [
    {'fruit': 'apple', 'color': 'red', 'response': 'right'},
    {'fruit': 'apple', 'color': 'red', 'response': 'wrong'},
    {'fruit': 'pineapple', 'color': 'green', 'response': 'True'},
    {'fruit': 'pineapple', 'color': 'green', 'response': 'wrong'},
    {'fruit': 'orange', 'color': 'orange', 'response': 'wrong'},
]
df = pd.DataFrame(rows)
print(df.drop_duplicates(['fruit','color']))
Output:
    color      fruit response
0     red      apple    right
2   green  pineapple     True
4  orange     orange    wrong
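Note that drop_duplicates on its own never looks at the response column; it only keeps whichever row comes first. If you want the rule from the question spelled out (drop a row only when its fruit/color pair is repeated and its response is 'wrong'), a boolean mask is one option. This is a minimal sketch, not the only way to do it; the names is_repeated, is_wrong and result are just illustrative.

```python
import pandas as pd

rows = [
    {'fruit': 'apple', 'color': 'red', 'response': 'right'},
    {'fruit': 'apple', 'color': 'red', 'response': 'wrong'},
    {'fruit': 'pineapple', 'color': 'green', 'response': 'True'},
    {'fruit': 'pineapple', 'color': 'green', 'response': 'wrong'},
    {'fruit': 'orange', 'color': 'orange', 'response': 'wrong'},
]
df = pd.DataFrame(rows)

# keep=False marks every row whose (fruit, color) pair occurs more than once
is_repeated = df.duplicated(subset=['fruit', 'color'], keep=False)
is_wrong = df['response'] == 'wrong'

# Drop only the rows that are both part of a repeated pair and marked 'wrong'
result = df[~(is_repeated & is_wrong)]
print(result)
```

The orange row survives because its pair is not repeated. One caveat: if a repeated pair contains only 'wrong' rows, this mask removes all of them, so check whether that is what you want.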
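Answer 1 (score: 0)
First, sort by the 'response' column so that the 'wrong' rows come last ('True' and 'right' both sort before 'wrong'):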
df.sort_values(['response'], inplace=True)
Output:
    color      fruit response
2   green  pineapple     True
0     red      apple    right
1     red      apple    wrong
3   green  pineapple    wrong
4  orange     orange    wrong
Then drop the duplicates; only the first row of each color/fruit combination is kept:
df.drop_duplicates(['color','fruit'], inplace = True)
Output:
    color      fruit response
2   green  pineapple     True
0     red      apple    right
4  orange     orange    wrong
You can restore the row order the dataframe had before sorting with:
df.sort_index(axis=0, inplace=True)
Output:
    color      fruit response
0     red      apple    right
2   green  pineapple     True
4  orange     orange    wrong
This gives you the desired output.
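If you would rather not modify df in place, the same three steps can be chained into one expression. This is a sketch of the same approach, assuming (as above) that 'wrong' sorts alphabetically after every response you want to keep; kind='mergesort' is my addition to make the sort stable so that tied rows keep their original relative order, and the name deduped is just illustrative.

```python
import pandas as pd

rows = [
    {'fruit': 'apple', 'color': 'red', 'response': 'right'},
    {'fruit': 'apple', 'color': 'red', 'response': 'wrong'},
    {'fruit': 'pineapple', 'color': 'green', 'response': 'True'},
    {'fruit': 'pineapple', 'color': 'green', 'response': 'wrong'},
    {'fruit': 'orange', 'color': 'orange', 'response': 'wrong'},
]
df = pd.DataFrame(rows)

deduped = (
    df.sort_values('response', kind='mergesort')   # 'wrong' rows sort last
      .drop_duplicates(subset=['fruit', 'color'])  # keep first row per pair
      .sort_index()                                # back to the original order
)
print(deduped)
```

Each method returns a new DataFrame, so df itself still holds all five rows afterwards.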