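Filtering a DataFrame in Python with specific requirements

Time: 2019-07-07 07:08:42

Tags: python filter

I want to filter a DataFrame, but I am having some difficulty doing so.

My DataFrame looks like this: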

+--------+----------+--------+-------+-----+------+------+------+--------+------+----+----+----------+-----+-----+------+------+-------------+------+
|  node  |   date   | isSetl | qual  | run | firm | acct | type | isCust | seg  | ec | cc | currency | lov | sov | isM  | pbc  |   spanReq   | anov |
+--------+----------+--------+-------+-----+------+------+------+--------+------+----+----+----------+-----+-----+------+------+-------------+------+
| oReq   | 20190627 | TRUE   | final |   0 | FCG  |   10 | S    | TRUE   | CUST |    |    | USD      |     |     | MNT  | CORE |   124073.69 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   10 | S    | TRUE   | CUST |    |    | CNY      |     |     |      |      |       43480 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   10 | S    | TRUE   | CUST |    |    | USD      |     |     |      |      |      117750 |    0 |
| oReq   | 20190627 | TRUE   | final |   0 | FCG  |   10 | S    | TRUE   | CUST |    |    | USD      |     |     | INIT | CORE |   124073.69 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   10 | S    | TRUE   | CUST |    |    | CNY      |     |     |      |      |       43480 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   10 | S    | TRUE   | CUST |    |    | USD      |     |     |      |      |      117750 |    0 |
| oReq   | 20190627 | TRUE   | final |   0 | FCG  |   40 | S    | TRUE   | CUST |    |    | CNH      |     |     | MNT  | CORE |           0 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   40 | S    | TRUE   | CUST |    |    | CNY      |     |     |      |      |      986680 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   40 | S    | TRUE   | CUST |    |    | HKD      |     |     |      |      |    28786701 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   40 | S    | TRUE   | CUST |    |    | USD      |     |     |      |      |       67790 |    0 |
| oReq   | 20190627 | TRUE   | final |   0 | FCG  |   40 | S    | TRUE   | CUST |    |    | CNH      |     |     | INIT | CORE |           0 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   40 | S    | TRUE   | CUST |    |    | CNY      |     |     |      |      |      986680 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   40 | S    | TRUE   | CUST |    |    | HKD      |     |     |      |      |    28786701 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   40 | S    | TRUE   | CUST |    |    | USD      |     |     |      |      |       67790 |    0 |
| oReq   | 20190627 | TRUE   | final |   0 | FCG  |   60 | S    | TRUE   | CUST |    |    | HKD      |     |     | MNT  | CORE | 17381842.35 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   60 | S    | TRUE   | CUST |    |    | HKD      |     |     |      |      |      245850 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   60 | S    | TRUE   | CUST |    |    | USD      |     |     |      |      |     2193000 |    0 |
| oReq   | 20190627 | TRUE   | final |   0 | FCG  |   60 | S    | TRUE   | CUST |    |    | HKD      |     |     | INIT | CORE | 17381842.35 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   60 | S    | TRUE   | CUST |    |    | HKD      |     |     |      |      |      245850 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   60 | S    | TRUE   | CUST |    |    | USD      |     |     |      |      |     2193000 |    0 |
| oReq   | 20190627 | TRUE   | final |   0 | FCG  |   70 | S    | TRUE   | CUST |    |    | HKD      |     |     | MNT  | CORE |      163900 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   70 | S    | TRUE   | CUST |    |    | HKD      |     |     |      |      |      163900 |    0 |
| oReq   | 20190627 | TRUE   | final |   0 | FCG  |   70 | S    | TRUE   | CUST |    |    | HKD      |     |     | INIT | CORE |      163900 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   70 | S    | TRUE   | CUST |    |    | HKD      |     |     |      |      |      163900 |    0 |
| oReq   | 20190627 | TRUE   | final |   0 | FCG  |   80 | S    | TRUE   | CUST |    |    | HKD      |     |     | MNT  | CORE |    25733800 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   80 | S    | TRUE   | CUST |    |    | HKD      |     |     |      |      |    25733800 |    0 |
| oReq   | 20190627 | TRUE   | final |   0 | FCG  |   80 | S    | TRUE   | CUST |    |    | HKD      |     |     | INIT | CORE |    25733800 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   80 | S    | TRUE   | CUST |    |    | HKD      |     |     |      |      |    25733800 |    0 |
+--------+----------+--------+-------+-----+------+------+------+--------+------+----+----+----------+-----+-----+------+------+-------------+------+

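I want to filter it so that I keep only the rows that sit below each INIT row in the "isM" column (the INIT rows themselves are not needed).

Desired output: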

+--------+----------+--------+-------+-----+------+------+------+--------+------+----+----+----------+-----+-----+-----+-----+----------+------+
|  node  |   date   | isSetl | qual  | run | firm | acct | type | isCust | seg  | ec | cc | currency | lov | sov | isM | pbc | spanReq  | anov |
+--------+----------+--------+-------+-----+------+------+------+--------+------+----+----+----------+-----+-----+-----+-----+----------+------+
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   10 | S    | TRUE   | CUST |    |    | CNY      |     |     |     |     |    43480 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   10 | S    | TRUE   | CUST |    |    | USD      |     |     |     |     |   117750 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   40 | S    | TRUE   | CUST |    |    | CNY      |     |     |     |     |   986680 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   40 | S    | TRUE   | CUST |    |    | HKD      |     |     |     |     | 28786701 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   40 | S    | TRUE   | CUST |    |    | USD      |     |     |     |     |    67790 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   60 | S    | TRUE   | CUST |    |    | HKD      |     |     |     |     |   245850 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   60 | S    | TRUE   | CUST |    |    | USD      |     |     |     |     |  2193000 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   70 | S    | TRUE   | CUST |    |    | HKD      |     |     |     |     |   163900 |    0 |
| curReq | 20190627 | TRUE   | final |   0 | FCG  |   80 | S    | TRUE   | CUST |    |    | HKD      |     |     |     |     | 25733800 |    0 |
+--------+----------+--------+-------+-----+------+------+------+--------+------+----+----+----------+-----+-----+-----+-----+----------+------+

How can I filter the DataFrame to get exactly this output?

I would appreciate some guidance on this.

1 answer:

Answer 0 (score: 1)

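This should do the trick. It adds two temporary columns to the DataFrame (df):

  • temp flags the rows where the isM column equals INIT. These rows are dropped later.

  • temp_ism forward-fills the isM column so that every row following an INIT row can be identified.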

    # Assumes blank isM cells are NaN; ffill() does not fill empty strings.
    df = df.assign(temp=df['isM'].eq('INIT'), temp_ism=df['isM'].ffill())
    # Keep rows whose forward-filled isM is `INIT`, excluding the `INIT` rows themselves (`~df['temp']`).
    result = df[df['temp_ism'].eq('INIT') & ~df['temp']].iloc[:, :-2]  # Drop the two temporary columns.
    df = df.iloc[:, :-2]  # Drop the two temporary columns from the original dataframe.
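
For anyone who wants to try the idea on a small scale, here is a minimal, self-contained sketch of the same forward-fill approach. The three-column frame is a made-up, trimmed-down stand-in for the table in the question. It also assumes the blank isM cells were read in as empty strings, so they are converted to NaN first. A boolean mask is used here instead of temporary columns, which is a variation on the answer rather than the answer's exact code.

```python
import pandas as pd

# Hypothetical, trimmed-down stand-in for the question's table.
df = pd.DataFrame({
    "node": ["oReq", "curReq", "curReq", "oReq", "curReq", "curReq"],
    "currency": ["USD", "CNY", "USD", "USD", "CNY", "USD"],
    "isM": ["MNT", "", "", "INIT", "", ""],
})

# Assumption: blanks are empty strings. where() turns them into NaN,
# because ffill() only fills missing values, not empty strings.
is_m = df["isM"].where(df["isM"].ne(""))

# Keep a row when the most recent non-blank isM value is INIT
# and the row is not the INIT row itself.
mask = is_m.ffill().eq("INIT") & is_m.ne("INIT")
result = df[mask]

print(result)
```

Because nothing is added to df, there are no temporary columns to drop afterwards. If the blanks are already NaN in your data, the where() step is harmless and can be removed.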