Remove a Pandas DataFrame column if it has more than K rows of zeros

Date: 2016-11-04 06:58:27

Tags: python pandas

I have the following data frame:

import pandas as pd
df = pd.DataFrame({'id':['a','b','c','d','e'],
                   'A':[-14,-90,-90,-96,-91],
                   'B':[-103,0,-110,-114,-114],
                   'D':[0,0,0,0,0],
                   'C':[-101,0,-110,0,0]})

It looks like this:

    A    B    C  D id
0 -14 -103 -101  0  a
1 -90    0    0  0  b
2 -90 -110 -110  0  c
3 -96 -114    0  0  d
4 -91 -114    0  0  e

What I want to do is drop any column that has 0 in more than 2 rows. How can I do that?

The final data frame should contain only these columns: A, B, id.

1 Answer:

Answer 0 (score: 3)

You can use cumsum together with any to build a mask, then slightly adapt boolean indexing so that it selects by columns instead of rows:

mask = ((df == 0).cumsum() > 1).any()
print (mask)
A     False
B     False
C      True
D      True
id    False
dtype: bool

# .ix was removed in pandas 1.0; .loc accepts the same boolean column mask
print (df.loc[:, ~mask])
    A    B id
0 -14 -103  a
1 -90    0  b
2 -90 -110  c
3 -96 -114  d
4 -91 -114  e
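
The same selection can also be written by counting the zeros in each column directly, which makes the threshold K explicit. This is a minimal sketch of an alternative, not the code from the answer; the function name `drop_zero_heavy_columns` and the parameter `k` are hypothetical names chosen for illustration. Note that `cumsum() > 1` flags a column as soon as it holds two or more zeros, while a strict reading of "more than 2 rows" corresponds to `k=2` below. Both give A, B, id on the sample data.

```python
import pandas as pd

def drop_zero_heavy_columns(df, k):
    """Return df without the columns that contain more than k zeros.

    Hypothetical helper for illustration; not part of pandas.
    """
    zero_counts = (df == 0).sum()        # number of zeros in each column
    return df.loc[:, zero_counts <= k]   # keep columns with at most k zeros

df = pd.DataFrame({'id': ['a', 'b', 'c', 'd', 'e'],
                   'A': [-14, -90, -90, -96, -91],
                   'B': [-103, 0, -110, -114, -114],
                   'D': [0, 0, 0, 0, 0],
                   'C': [-101, 0, -110, 0, 0]})

result = drop_zero_heavy_columns(df, k=2)
print(sorted(result.columns))
```

Counting with sum avoids building the intermediate cumulative table, which is only needed when the position of the zeros matters.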

Explanation of the mask:

print (df == 0)
       A      B      C     D     id
0  False  False  False  True  False
1  False   True   True  True  False
2  False  False  False  True  False
3  False  False   True  True  False
4  False  False   True  True  False

print ((df == 0).cumsum())
   A  B  C  D  id
0  0  0  0  1   0
1  0  1  1  2   0
2  0  1  1  3   0
3  0  1  2  4   0
4  0  1  3  5   0

print ((df == 0).cumsum() > 1)
       A      B      C      D     id
0  False  False  False  False  False
1  False  False  False   True  False
2  False  False  False   True  False
3  False  False  False   True  False
4  False  False   True   True  False
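
Because a cumulative count never decreases, its last row equals the column total, so testing `(cumsum > k).any()` should flag exactly the same columns as testing `sum > k`. The short check below illustrates this on the sample data for several thresholds; the variable names `is_zero`, `via_cumsum` and `via_sum` are assumptions introduced for the example.

```python
import pandas as pd

df = pd.DataFrame({'id': ['a', 'b', 'c', 'd', 'e'],
                   'A': [-14, -90, -90, -96, -91],
                   'B': [-103, 0, -110, -114, -114],
                   'D': [0, 0, 0, 0, 0],
                   'C': [-101, 0, -110, 0, 0]})

is_zero = (df == 0)  # boolean frame: True wherever a cell equals 0

for k in range(6):
    via_cumsum = (is_zero.cumsum() > k).any()  # running count exceeds k somewhere
    via_sum = is_zero.sum() > k                # total count exceeds k
    # Both masks mark the same columns for every threshold tried
    assert via_cumsum.equals(via_sum)
```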

EDIT by comment: to drop only the columns that consist entirely of zeros, the mask needs all:

mask = (df == 0).all()
print (mask)
A     False
B     False
C     False
D      True
id    False
dtype: bool

# .ix was removed in pandas 1.0; .loc accepts the same boolean column mask
print (df.loc[:, ~mask])
    A    B    C id
0 -14 -103 -101  a
1 -90    0    0  b
2 -90 -110 -110  c
3 -96 -114    0  d
4 -91 -114    0  e
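
For the all-zero case, the negation can be folded into the mask itself: a column is not entirely zero exactly when it has at least one non-zero entry. The following is a sketch of that equivalent form, assuming the same sample data; the names `keep` and `result` are introduced only for this example.

```python
import pandas as pd

df = pd.DataFrame({'id': ['a', 'b', 'c', 'd', 'e'],
                   'A': [-14, -90, -90, -96, -91],
                   'B': [-103, 0, -110, -114, -114],
                   'D': [0, 0, 0, 0, 0],
                   'C': [-101, 0, -110, 0, 0]})

# True for every column holding at least one non-zero value
keep = (df != 0).any()
result = df.loc[:, keep]
print(sorted(result.columns))
```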