Question

I have a DataFrame:

df:

customer     sample1  sample2  sample3  sample4
costprice1        10       21       32       43
costprice2        12       24       15       18
costprice3         1        2       15        8
costprice4        16       30       44       58
costprice5        18       33       48       63
costprice6        20       36       52       68
costprice7        22       39       56       73
costprice8        24       42       60       78
costprice9        26       45       64       83
costprice10       28       48       68       88

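I want to delete every row in which more than 2 columns have a value less than 15.

So this row would be deleted, because three of its values (1, 2 and 8) are below 15: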

costprice3  1   2   15  8

In R we can do this with:

df[rowSums(df < 15) <=2 , , drop = FALSE]

Can this be done in pandas? So far I have been using pandas any to filter rows:

df_filtered = df[(df > threshold).any(axis=1)]

Answer 1

In [16]: df[df.select_dtypes(['number']).lt(15).sum(axis=1) < 3]
Out[16]:
      customer  sample1  sample2  sample3  sample4
0   costprice1       10       21       32       43
1   costprice2       12       24       15       18
3   costprice4       16       30       44       58
4   costprice5       18       33       48       63
5   costprice6       20       36       52       68
6   costprice7       22       39       56       73
7   costprice8       24       42       60       78
8   costprice9       26       45       64       83
9  costprice10       28       48       68       88
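
For a self-contained check, the sketch below rebuilds the sample data from the question and splits the one-liner above into steps. The intermediate names (numeric, below_count, result) are illustrative and not part of the original answer; the condition is written as <= 2 to mirror the R expression, which is equivalent to < 3 for integer counts.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "customer": [f"costprice{i}" for i in range(1, 11)],
    "sample1": [10, 12, 1, 16, 18, 20, 22, 24, 26, 28],
    "sample2": [21, 24, 2, 30, 33, 36, 39, 42, 45, 48],
    "sample3": [32, 15, 15, 44, 48, 52, 56, 60, 64, 68],
    "sample4": [43, 18, 8, 58, 63, 68, 73, 78, 83, 88],
})

# Keep only numeric columns so the string column "customer"
# is not compared against a number.
numeric = df.select_dtypes(["number"])

# Per row, count how many values are strictly below 15
# (True counts as 1 when summed).
below_count = numeric.lt(15).sum(axis=1)

# Keep rows with at most 2 such values, like rowSums(df < 15) <= 2 in R.
result = df[below_count <= 2]
print(result)
```

Only the costprice3 row has three values below 15, so it is the only row dropped.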

Bonus answer (combining this with another condition):

mask = <condition1>
df[mask & (df.select_dtypes(['number']).lt(15).sum(axis=1) < 3)]
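
As a concrete illustration, the sketch below fills in the <condition1> placeholder with a made-up condition (sample1 below 15). That condition is purely hypothetical, chosen only to show how the two boolean masks combine; the name few_small is also illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "customer": [f"costprice{i}" for i in range(1, 11)],
    "sample1": [10, 12, 1, 16, 18, 20, 22, 24, 26, 28],
    "sample2": [21, 24, 2, 30, 33, 36, 39, 42, 45, 48],
    "sample3": [32, 15, 15, 44, 48, 52, 56, 60, 64, 68],
    "sample4": [43, 18, 8, 58, 63, 68, 73, 78, 83, 88],
})

# Hypothetical stand-in for <condition1>: sample1 is below 15.
mask = df["sample1"].lt(15)

# Rows with fewer than 3 numeric values below 15.
few_small = df.select_dtypes(["number"]).lt(15).sum(axis=1) < 3

# Element-wise AND of the two boolean Series; both are aligned on the index.
result = df[mask & few_small]
print(result)
```

Here mask alone selects costprice1, costprice2 and costprice3, and few_small then excludes costprice3. When the comparison is written inline instead of being stored in a variable, the parentheses around it are needed because & binds more tightly than < in Python.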
