Python pandas DataFrame: the 'keep' parameter does not work in drop_duplicates() / duplicated()

Time: 2016-05-11 11:44:03

Tags: python pandas dataframe duplicates

Suppose we have a DataFrame df:

import pandas as pd

df = pd.DataFrame()
df['c1'] = [1, 2, 3, 3, 4]
df['c2'] = ["a1", "a2", "a2", "a2", "a1"]
df['c3'] = [1, 2, 3, 3, 5]

If I call df.drop_duplicates(keep=False) or df.duplicated(keep=False), I get the following error:

  File "C:\Users\Kanika\Anaconda\lib\site-packages\pandas\util\decorators.py", line 88, in wrapper
    return func(*args, **kwargs)

TypeError: duplicated() got an unexpected keyword argument 'keep'

1 Answer:

Answer 0: (score: 2)

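You should upgrade your pandas version, because the keep keyword was only added in version 0.17.0. From "what's new in v. 0.17.0":

  • drop_duplicates and duplicated now accept a keep keyword to target first, last, and all duplicates.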
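
Before upgrading, it may help to confirm that the installed version really predates 0.17.0. The snippet below is a minimal sketch of such a check; version_tuple and supports_keep are hypothetical names introduced here for illustration, not part of pandas.

```python
import re

import pandas as pd


def version_tuple(version):
    """Return (major, minor) as ints from a version string such as '0.18.1'.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    return int(match.group(1)), int(match.group(2))


# `keep` was added in pandas 0.17.0, according to the release notes quoted above.
supports_keep = version_tuple(pd.__version__) >= (0, 17)
print(pd.__version__, supports_keep)
```

If supports_keep is False, the TypeError from the question is expected, and upgrading pandas (for example through conda or pip, depending on how it was installed) should resolve it.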

Both work in pandas 0.18.1:

In [116]: df
Out[116]:
   c1  c2  c3
0   1  a1   1
1   2  a2   2
2   3  a2   3
3   3  a2   3
4   4  a1   5

In [117]: df.drop_duplicates()
Out[117]:
   c1  c2  c3
0   1  a1   1
1   2  a2   2
2   3  a2   3
4   4  a1   5

In [118]: df.drop_duplicates(keep=False)
Out[118]:
   c1  c2  c3
0   1  a1   1
1   2  a2   2
4   4  a1   5
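
The session above only exercises drop_duplicates. As a rough sketch, assuming pandas 0.17.0 or later, the following compares the three values of keep on the same data and shows how duplicated(keep=False) yields the matching boolean mask. The variable names are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "c1": [1, 2, 3, 3, 4],
    "c2": ["a1", "a2", "a2", "a2", "a1"],
    "c3": [1, 2, 3, 3, 5],
})

# keep=False flags every member of a duplicate group, here rows 2 and 3.
mask = df.duplicated(keep=False)

kept_first = df.drop_duplicates(keep="first")  # keeps row 2, drops row 3
kept_last = df.drop_duplicates(keep="last")    # keeps row 3, drops row 2
kept_none = df.drop_duplicates(keep=False)     # drops both rows 2 and 3

print(mask.tolist())
print(kept_first.index.tolist())
print(kept_last.index.tolist())
print(kept_none.index.tolist())
```

Filtering with df[~mask] gives the same rows as drop_duplicates(keep=False), which is handy when the mask is also needed to inspect the duplicated rows themselves via df[mask].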