Query about data manipulation

Time: 2018-07-20 09:53:36

Tags: python pandas

import pandas as pd

df = [{'fruit': 'apple', 'color': 'red', 'response': 'right'},
     {'fruit': 'apple',  'color': 'red', 'response': 'wrong'},
     {'fruit': 'pineapple',  'color': 'green',  'response': 'True' },
     {'fruit': 'pineapple',  'color': 'green',  'response': 'wrong' },
     {'fruit': 'orange',  'color': 'orange',  'response': 'wrong' }]



df = pd.DataFrame(df)

I want to drop the duplicated observations of each fruit and color combination where response = 'wrong'.

2 answers:

Answer 0 (score: 0)

You can use drop_duplicates on the fruit and color columns.

For example:

import pandas as pd
df = [{'fruit': 'apple', 'color': 'red', 'response': 'right'},
     {'fruit': 'apple',  'color': 'red', 'response': 'wrong'},
     {'fruit': 'pineapple',  'color': 'green',  'response': 'True' },
     {'fruit': 'pineapple',  'color': 'green',  'response': 'wrong' },
     {'fruit': 'orange',  'color': 'orange',  'response': 'wrong' }]

df = pd.DataFrame(df)
print(df.drop_duplicates(['fruit','color']))

Output:

    color      fruit response
0     red      apple    right
2   green  pineapple     True
4  orange     orange    wrong
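Note that drop_duplicates keeps the first row of each group by default (keep='first'), so the result above depends on each 'wrong' row coming after the row you want to keep. The following is a minimal sketch of that dependence; the two-row frame `reordered` is a hypothetical reordering of the question's data, not something from the original post.

```python
import pandas as pd

# Hypothetical reordering: the 'wrong' row for apple/red now comes first.
rows = [{'fruit': 'apple', 'color': 'red', 'response': 'wrong'},
        {'fruit': 'apple', 'color': 'red', 'response': 'right'}]
reordered = pd.DataFrame(rows)

# Default keep='first' retains whichever row appears first in the group.
kept_first = reordered.drop_duplicates(['fruit', 'color'])
# keep='last' retains the final row of the group instead.
kept_last = reordered.drop_duplicates(['fruit', 'color'], keep='last')

print(kept_first['response'].tolist())  # ['wrong']
print(kept_last['response'].tolist())   # ['right']
```

If the row order is not guaranteed, sorting first or filtering on the response column explicitly is safer.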

Answer 1 (score: 0)

First, sort by the 'response' column:

df.sort_values(['response'], inplace=True)

Output:

   color      fruit response 
2   green  pineapple     True
0     red      apple    right
1     red      apple    wrong
3   green  pineapple    wrong
4  orange     orange    wrong
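The sort places the 'wrong' rows last only because of how these particular strings compare: uppercase 'T' sorts before lowercase letters, and 'r' sorts before 'w'. A small sketch of that ordering follows, along with a more explicit alternative that sorts on an "is wrong" flag; the names `responses`, `is_wrong`, and `order` are illustrative assumptions.

```python
import pandas as pd

# The response values from the question, in their original row order.
responses = pd.Series(['right', 'wrong', 'True', 'wrong', 'wrong'])

# Plain string ordering of the distinct values.
print(sorted(set(responses)))  # ['True', 'right', 'wrong']

# More explicit: sort on a boolean flag so 'wrong' rows always go last,
# whatever the other labels are. mergesort is stable, so ties keep
# their original relative order.
is_wrong = responses.eq('wrong')
order = is_wrong.sort_values(kind='mergesort').index
print(order.tolist())  # [0, 2, 1, 3, 4]
```

With a label such as 'zzz' in the data, the plain string sort would no longer put 'wrong' last, while the flag-based order would.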

Then you can drop the duplicated color and fruit combinations with:

df.drop_duplicates(['color','fruit'], inplace = True)

Output:

    color      fruit response
2   green  pineapple     True
0     red      apple    right
4  orange     orange    wrong

You can then restore the DataFrame to the order it had before sorting with:
df.sort_index(axis=0, inplace= True)

Output:

    color      fruit response
0     red      apple    right
2   green  pineapple     True
4  orange     orange    wrong

This gives you the desired output.
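As an alternative that does not rely on row order or on how the response strings sort, the condition from the question can be written directly as a boolean mask. This is only a sketch under the assumption that "duplicate" means any fruit and color combination that occurs more than once; the names `records`, `is_dup_combo`, `is_wrong`, and `result` are illustrative.

```python
import pandas as pd

records = [{'fruit': 'apple', 'color': 'red', 'response': 'right'},
           {'fruit': 'apple', 'color': 'red', 'response': 'wrong'},
           {'fruit': 'pineapple', 'color': 'green', 'response': 'True'},
           {'fruit': 'pineapple', 'color': 'green', 'response': 'wrong'},
           {'fruit': 'orange', 'color': 'orange', 'response': 'wrong'}]
df = pd.DataFrame(records)

# True for every row whose (fruit, color) pair occurs more than once.
is_dup_combo = df.duplicated(['fruit', 'color'], keep=False)
# True for every row with response == 'wrong'.
is_wrong = df['response'] == 'wrong'

# Drop only the rows that are both duplicated and 'wrong'.
result = df[~(is_dup_combo & is_wrong)]
print(result.index.tolist())  # [0, 2, 4]
```

The behavior differs from drop_duplicates in edge cases: if every row of a duplicated combination is 'wrong', this mask removes all of them, and if several rows of a combination are not 'wrong', it keeps them all.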