Pandas: Remove certain values from a column with a string header

Time: 2016-11-29 14:10:50

Tags: python string pandas dataframe

Say I have the following DataFrame df:

        First C        Second C       Third C
0       0.104000       0.864000       -999
1       0.060337       0.812470       -999
2       0.065797       0.819570       0.802607
3       0.064715       0.817212       0.801755

I want to drop the first two rows, because the column Third C shows two strange values there. I tried:

df = df.drop(df[df.('Third C') == -999].index)

This raises:

       df = df.drop(df[df.('Third C') == -999].index)
                          ^
SyntaxError: invalid syntax

The same thing happens if I use square brackets, df.['Third C']. How can I do this without renaming the column?

1 Answer:

Answer 0: (score: 1)

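Just use [] and remove the dot. The dot is attribute access and must be followed by a plain identifier, so neither df.('Third C') nor df.['Third C'] is valid Python: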

df = df.drop(df[df['Third C'] == -999].index)

But it is better to use boolean indexing:

df = df[df['Third C'] != -999]
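For a self-contained check, here is a minimal sketch that rebuilds the DataFrame from the question and applies both approaches. The names dropped and filtered are only illustrative and do not appear in the question.

```python
import pandas as pd

# Rebuild the example DataFrame from the question.
df = pd.DataFrame({
    'First C': [0.104000, 0.060337, 0.065797, 0.064715],
    'Second C': [0.864000, 0.812470, 0.819570, 0.817212],
    'Third C': [-999, -999, 0.802607, 0.801755],
})

# Approach 1: find the index labels of the bad rows, then drop them.
dropped = df.drop(df[df['Third C'] == -999].index)

# Approach 2: boolean indexing, keeping only rows where the mask is True.
filtered = df[df['Third C'] != -999]

print(filtered)
print(dropped.equals(filtered))  # True: both keep rows 2 and 3
```

Both results keep the original index labels (2 and 3); calling reset_index(drop=True) afterwards would renumber them from 0 if that is wanted.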

Timings:

The drop solution is slower, because it does the boolean indexing and then a drop on top of it:

In [204]: %timeit (df.drop(df[df['Third C'] == -999].index))
1000 loops, best of 3: 691 µs per loop

In [205]: %timeit (df[df['Third C'] != -999])
1000 loops, best of 3: 359 µs per loop
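
The %timeit lines above are IPython magics. Outside IPython, a rough equivalent can be sketched with the standard timeit module. The helper names with_drop and with_mask are hypothetical, the same four-row DataFrame is assumed, and the absolute numbers will vary with the machine and the pandas version.

```python
import timeit

import pandas as pd

df = pd.DataFrame({
    'First C': [0.104000, 0.060337, 0.065797, 0.064715],
    'Second C': [0.864000, 0.812470, 0.819570, 0.817212],
    'Third C': [-999, -999, 0.802607, 0.801755],
})


def with_drop():
    # Boolean indexing to find the bad rows, followed by drop.
    return df.drop(df[df['Third C'] == -999].index)


def with_mask():
    # Boolean indexing only.
    return df[df['Third C'] != -999]


n = 1000  # number of loops, matching the %timeit runs above
t_drop = timeit.timeit(with_drop, number=n)
t_mask = timeit.timeit(with_mask, number=n)

print(f"drop: {t_drop / n * 1e6:.0f} us per loop")
print(f"mask: {t_mask / n * 1e6:.0f} us per loop")
```

With only four rows the measurement is dominated by fixed overhead, so it is worth repeating the comparison on a DataFrame of realistic size before drawing conclusions.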