Question

Here is an example pandas DataFrame:

import pandas as pd
import numpy as np

data = {"first_column": ["item1", "item2", "item3", "item4", "item5", "item6", "item7"],
        "second_column": ["cat1", "cat1", "cat1", "cat2", "cat2", "cat2", "cat2"],
        "third_column": [5, 1, 8, 3, 731, 189, 9]}

df = pd.DataFrame(data)

df
     first_column second_column  third_column
0        item1          cat1             5
1        item2          cat1             1
2        item3          cat1             8
3        item4          cat2             3
4        item5          cat2           731
5        item6          cat2           189
6        item7          cat2             9

I want to filter third_column on the condition 10 <= x <= 1000.

For greater than or equal to 10, that is:

df['greater_than_ten'] = df.third_column.ge(10).astype(np.uint8)

For less than or equal to 1000, that is:

df['less_than_1K'] = df.third_column.le(1000).astype(np.uint8)

But I cannot apply both operations at the same time, i.e.

df['both'] = df.third_column.le(1000).ge(10).astype(np.uint8)

Applying the operations one after the other does not work either.

How can I use .ge() and .le() together?

Answer 1

You can use between() on the Series you are interested in instead.

df['both'] = df.third_column.between(10, 1000).astype(np.uint8)

which yields

>>> df

  first_column second_column  third_column  both
0        item1          cat1             5     0
1        item2          cat1             1     0
2        item3          cat1             8     0
3        item4          cat2             3     0
4        item5          cat2           731     1
5        item6          cat2           189     1
6        item7          cat2             9     0
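
To illustrate how between() treats the boundaries, here is a minimal sketch on a throwaway Series (the name s and its values are made up for illustration). By default both endpoints are included; the string form of the inclusive argument assumes pandas 1.3 or newer.

```python
import pandas as pd

# Hypothetical values chosen to sit on and around the two bounds.
s = pd.Series([9, 10, 500, 1000, 1001])

# Default behaviour: 10 <= x <= 1000, both endpoints included.
default_mask = s.between(10, 1000)

# Strict version: 10 < x < 1000 (string argument assumes pandas >= 1.3).
strict_mask = s.between(10, 1000, inclusive="neither")

print(default_mask.tolist())
print(strict_mask.tolist())
```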

Answer 2

Use & to combine the conditions:

In [28]:
df['both'] = df['third_column'].ge(10) & df['third_column'].le(1000)
df

Out[28]:
  first_column second_column  third_column   both
0        item1          cat1             5  False
1        item2          cat1             1  False
2        item3          cat1             8  False
3        item4          cat2             3  False
4        item5          cat2           731   True
5        item6          cat2           189   True
6        item7          cat2             9  False
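
As a rough sketch of why the chained attempt in the question does not give the expected result: .le(1000) already returns a boolean Series, so the following .ge(10) compares True/False against 10 rather than the original numbers. The Series name s below is just for illustration, and the behaviour described is what I would expect from recent pandas versions.

```python
import numpy as np
import pandas as pd

s = pd.Series([5, 1, 8, 3, 731, 189, 9], name="third_column")

# The second comparison runs on booleans (True/False), not on the
# original values, so no element can be >= 10.
chained = s.le(1000).ge(10)

# Evaluate each condition on the original values, then combine with &.
# The parentheses matter: .astype must apply to the combined mask.
combined = (s.ge(10) & s.le(1000)).astype(np.uint8)

print(chained.tolist())
print(combined.tolist())
```

If you want the 0/1 column from the question rather than True/False, the .astype(np.uint8) at the end takes care of that.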

Answer 3

In [11]: df['both'] = df.eval("10 <= third_column <= 1000").astype(np.uint8)

In [12]: df
Out[12]:
  first_column second_column  third_column  both
0        item1          cat1             5     0
1        item2          cat1             1     0
2        item3          cat1             8     0
3        item4          cat2             3     0
4        item5          cat2           731     1
5        item6          cat2           189     1
6        item7          cat2             9     0

UPDATE:

In [13]: df.eval("second_column in ['cat2'] and 10 <= third_column <= 1000").astype(np.uint8)
Out[13]:
0    0
1    0
2    0
3    0
4    1
5    1
6    0
dtype: uint8
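
If the goal is to keep only the matching rows rather than add a flag column, the same expression should also work with query(). This is a small sketch on the DataFrame from the question; the variable names in_range and same_rows are mine.

```python
import pandas as pd

df = pd.DataFrame({
    "first_column": ["item1", "item2", "item3", "item4", "item5", "item6", "item7"],
    "second_column": ["cat1", "cat1", "cat1", "cat2", "cat2", "cat2", "cat2"],
    "third_column": [5, 1, 8, 3, 731, 189, 9],
})

# Keep only the rows whose third_column lies in [10, 1000].
in_range = df.query("10 <= third_column <= 1000")

# Equivalent row selection using a boolean mask from between().
same_rows = df[df["third_column"].between(10, 1000)]

print(in_range)
```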

How to use .le() and .ge() when filtering a pandas DataFrame column?