Suppose we have a DataFrame df:
import pandas as pd
df = pd.DataFrame()
df['c1'] = [1, 2, 3, 3, 4]
df['c2'] = ["a1", "a2", "a2", "a2", "a1"]
df['c3'] = [1, 2, 3, 3, 5]
If I call df.drop_duplicates(keep=False) or df.duplicated(keep=False), I get the following error:
File "C:\Users\Kanika\Anaconda\lib\site-packages\pandas\util\decorators.py", line 88, in wrapper
return func(*args, **kwargs)
TypeError: duplicated() got an unexpected keyword argument 'keep'
Answer 0 (score: 2)
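You should update your pandas version, because the keep keyword was only added in version 0.17.0. From "what's new in v. 0.17.0":
drop_duplicates and duplicated now accept a keep keyword to target first, last, and all duplicates.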
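Before upgrading, it can help to confirm which pandas version is installed. The following is a minimal sketch; the helper name supports_keep is hypothetical, and the 0.17 threshold is taken from the release note quoted above.

```python
import re

import pandas as pd


def supports_keep(version: str) -> bool:
    """Return True if this pandas version string is 0.17 or newer.

    Hypothetical helper: it only compares the leading major.minor numbers.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: " + version)
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= (0, 17)


# Report the installed version and whether keep= should be accepted.
print(pd.__version__, supports_keep(pd.__version__))
```

If this reports False, upgrading pandas (for example through your package manager) should make the TypeError go away.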
Both calls work in pandas 0.18.1:
In [116]: df
Out[116]:
   c1  c2  c3
0   1  a1   1
1   2  a2   2
2   3  a2   3
3   3  a2   3
4   4  a1   5

In [117]: df.drop_duplicates()
Out[117]:
   c1  c2  c3
0   1  a1   1
1   2  a2   2
2   3  a2   3
4   4  a1   5

In [118]: df.drop_duplicates(keep=False)
Out[118]:
   c1  c2  c3
0   1  a1   1
1   2  a2   2
4   4  a1   5
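The session above covers the default and keep=False for drop_duplicates. As a small sketch of the other cases, assuming pandas 0.17.0 or later, the same frame can be used to look at duplicated(keep=False) and keep='last'. The variable names mask_all and kept_last are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "c1": [1, 2, 3, 3, 4],
    "c2": ["a1", "a2", "a2", "a2", "a1"],
    "c3": [1, 2, 3, 3, 5],
})

# keep=False marks every row that has a duplicate, not just the later copies.
mask_all = df.duplicated(keep=False)
print(mask_all.tolist())         # [False, False, True, True, False]

# keep='last' drops the earlier copy and keeps the last occurrence.
kept_last = df.drop_duplicates(keep="last")
print(kept_last.index.tolist())  # [0, 1, 3, 4]

# Filtering with the inverted mask matches drop_duplicates(keep=False).
print(df[~mask_all].index.tolist())  # [0, 1, 4]
```

Rows 2 and 3 are identical across all three columns, so they are the only rows flagged.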