Say I have the following DataFrame df:
    First C  Second C   Third C
0  0.104000  0.864000      -999
1  0.060337  0.812470      -999
2  0.065797  0.819570  0.802607
3  0.064715  0.817212  0.801755
I want to drop the first two rows, because the column Third C shows two strange values there.
df = df.drop(df[df.('Third C') == -999].index)
This raises:
    df = df.drop(df[df.('Third C') == -999].index)
                       ^
SyntaxError: invalid syntax
The same thing happens if I use square brackets, df.['Third C']. How can I do this without renaming the column?
Answer 0 (score: 1)
Use only [] and remove the dot before it:
df = df.drop(df[df['Third C'] == -999].index)
But it is better to use boolean indexing:
df = df[df['Third C'] != -999]
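As a minimal, self-contained sketch of the two approaches above, the snippet below rebuilds the sample frame from the question and checks that both give the same rows. The variable names dropped and filtered are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "First C": [0.104000, 0.060337, 0.065797, 0.064715],
    "Second C": [0.864000, 0.812470, 0.819570, 0.817212],
    "Third C": [-999, -999, 0.802607, 0.801755],
})

# Approach 1: find the index labels of the unwanted rows, then drop them.
dropped = df.drop(df[df["Third C"] == -999].index)

# Approach 2: boolean indexing, keeping only rows where the mask is True.
filtered = df[df["Third C"] != -999]

print(filtered)
print(dropped.equals(filtered))
```

Both results keep the original index labels 2 and 3; call reset_index(drop=True) afterwards if a fresh 0-based index is wanted.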
Timings:
The drop solution is slower, because it uses boolean indexing and then drop on top of it:
In [204]: %timeit (df.drop(df[df['Third C'] == -999].index))
1000 loops, best of 3: 691 µs per loop
In [205]: %timeit (df[df['Third C'] != -999])
1000 loops, best of 3: 359 µs per loop
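Outside IPython, a comparison like this can be approximated with the standard-library timeit module. This is only a rough sketch: the helper names with_drop and with_mask are made up for illustration, and the absolute numbers will vary with the machine, the pandas version, and the size of the frame.

```python
import timeit

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "First C": [0.104000, 0.060337, 0.065797, 0.064715],
    "Second C": [0.864000, 0.812470, 0.819570, 0.817212],
    "Third C": [-999, -999, 0.802607, 0.801755],
})


def with_drop():
    # Boolean indexing to find the bad rows, followed by drop.
    return df.drop(df[df["Third C"] == -999].index)


def with_mask():
    # Boolean indexing only.
    return df[df["Third C"] != -999]


n = 1000  # number of calls per measurement
t_drop = timeit.timeit(with_drop, number=n) / n
t_mask = timeit.timeit(with_mask, number=n) / n

print(f"drop: {t_drop * 1e6:.0f} µs per call")
print(f"mask: {t_mask * 1e6:.0f} µs per call")
```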