Question

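I have a dataset that looks like the following:

I want to drop rows such as 4 and 5, because most of their columns are 0, though not all of them. At the same time, I do not want to drop rows such as 0 and 1, because only a few of their entries are 0.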

Answer 1

First, create a column that counts the zeros in each row:

df['no_of_zeros'] = (df == 0).sum(axis=1)

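Then decide how many zeros per row are acceptable and filter the DataFrame on that count (here, only rows with fewer than 3 zeros are kept):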

df=df[df['no_of_zeros'] < 3].drop(['no_of_zeros'], axis=1)
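
The two steps above can be put together as a small self-contained sketch. The sample data and the name max_zeros are made up for illustration and do not come from the question; the threshold of 3 is the one used in this answer.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question
df = pd.DataFrame({'a': [1, 0, 0, 5],
                   'b': [2, 0, 3, 0],
                   'c': [0, 0, 4, 6],
                   'd': [3, 7, 0, 0]})

# Assumed threshold: rows with this many zeros (or more) are dropped
max_zeros = 3

# Step 1: count the zeros in each row (True sums as 1)
df['no_of_zeros'] = (df == 0).sum(axis=1)

# Step 2: keep rows below the threshold, then drop the helper column
filtered = df[df['no_of_zeros'] < max_zeros].drop(columns='no_of_zeros')
print(filtered)
```

Because the helper column is added only after the zeros have been counted, it does not affect the count itself.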

Answer 2

Here is one way:

import pandas as pd

df = pd.DataFrame([[0, 1, 2, 3, 4],
                   [0, 0, 0, 1, 2]],
                  columns=['A', 'B', 'C', 'D', 'E'])

# Drop rows in which more than half of the columns are 0
df = df[~((df == 0).astype(int).sum(axis=1) > len(df.columns) / 2)]

#    A  B  C  D  E
# 0  0  1  2  3  4
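
A possible variation, shown here only as a sketch: taking the row-wise mean of the boolean mask gives the fraction of zeros directly, so the threshold can be written as a proportion instead of being derived from the column count. The names zero_fraction and result are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[0, 1, 2, 3, 4],
                   [0, 0, 0, 1, 2]],
                  columns=['A', 'B', 'C', 'D', 'E'])

# Fraction of zero-valued cells in each row (True counts as 1, False as 0)
zero_fraction = (df == 0).mean(axis=1)

# Keep rows in which at most half of the columns are 0
result = df[zero_fraction <= 0.5]
print(result)
```

This is equivalent to comparing the zero count against len(df.columns) / 2, and it may read more naturally when the cutoff is something other than one half.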

Answer 3

Assuming "majority" means "more than half of the columns", this works:

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'c2': {0: 76, 1: 45, 2: 47, 3: 92, 4: 0, 5: 0, 6: 26, 7: 0, 8: 71},
   ...:  'c3': {0: 0, 1: 3, 2: 6, 3: 9, 4: 0, 5: 0, 6: 12, 7: 0, 8: 15},
   ...:  'c4': {0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1},
   ...:  'c5': {0: 23, 1: 0, 2: 23, 3: 23, 4: 0, 5: 0, 6: 23, 7: 0, 8: 23},
   ...:  'c6': {0: 65, 1: 25, 2: 62, 3: 26, 4: 52, 5: 22, 6: 65, 7: 0, 8: 69},
   ...:  'c7': {0: 12, 1: 12, 2: 12, 3: 12, 4: 12, 5: 12, 6: 12, 7: 12, 8: 12},
   ...:  'c8': {0: 0, 1: 0, 2: 8, 3: 9, 4: 0, 5: 0, 6: 4, 7: 0, 8: 4},
   ...:  'cl': {0: 5, 1: 7, 2: 8, 3: 15, 4: 0, 5: 0, 6: 2, 7: 0, 8: 5},
   ...:  'sr': {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8}})
   ...:  

In [3]: df
Out[3]: 
   c2  c3  c4  c5  c6  c7  c8  cl  sr
0  76   0   1  23  65  12   0   5   0
1  45   3   1   0  25  12   0   7   1
2  47   6   1  23  62  12   8   8   2
3  92   9   1  23  26  12   9  15   3
4   0   0   1   0  52  12   0   0   4
5   0   0   1   0  22  12   0   0   5
6  26  12   1  23  65  12   4   2   6
7   0   0   1   0   0  12   0   0   7
8  71  15   1  23  69  12   4   5   8

In [4]: df[((df == 0).sum(axis=1) <= len(df.columns) / 2)]
Out[4]: 
   c2  c3  c4  c5  c6  c7  c8  cl  sr
0  76   0   1  23  65  12   0   5   0
1  45   3   1   0  25  12   0   7   1
2  47   6   1  23  62  12   8   8   2
3  92   9   1  23  26  12   9  15   3
6  26  12   1  23  65  12   4   2   6
8  71  15   1  23  69  12   4   5   8
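
One thing to watch for: the count above includes every column, so a 0 in sr is counted as well. If sr is only a row identifier (an assumption, the question does not say), it could be left out of the count. A minimal sketch with the same values, where value_cols, zero_counts and kept are illustrative names:

```python
import pandas as pd

# Same values as the DataFrame above, written as column lists
df = pd.DataFrame({
    'c2': [76, 45, 47, 92, 0, 0, 26, 0, 71],
    'c3': [0, 3, 6, 9, 0, 0, 12, 0, 15],
    'c4': [1] * 9,
    'c5': [23, 0, 23, 23, 0, 0, 23, 0, 23],
    'c6': [65, 25, 62, 26, 52, 22, 65, 0, 69],
    'c7': [12] * 9,
    'c8': [0, 0, 8, 9, 0, 0, 4, 0, 4],
    'cl': [5, 7, 8, 15, 0, 0, 2, 0, 5],
    'sr': list(range(9)),
})

# Assumption: 'sr' is just a serial number, so exclude it from the zero count
value_cols = df.columns.drop('sr')
zero_counts = (df[value_cols] == 0).sum(axis=1)

# Keep rows in which at most half of the counted columns are 0
kept = df[zero_counts <= len(value_cols) / 2]
print(kept)
```

For this data the same rows are kept either way; the difference only matters when an identifier column happens to tip a row over the threshold.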

Drop rows of a pandas DataFrame in which most values are 0