Pandas: backfill NaN with a decreasing/increasing sequence

Date: 2020-04-14 03:45:01

Tags: pandas

I have a DataFrame

| ind  |   A  |    B   |
------------------------
| 1.01 |  10  | -1.734 |
| 1.04 |  10  | -1.244 |
| 1.05 |  10  |  0.016 |
| 1.11 |  NaN | -2.737 | <-
| 1.13 |  NaN | -4.232 | <-
| 1.19 |  11  | -3.241 | <=
| 1.20 |  12  | -2.832 |
| 1.21 |  10  | -4.277 |

and I would like to backfill the NaN values in column A with a decreasing sequence that ends at the next valid value:

| ind  |   A  |    B   |
------------------------
| 1.01 |  10  | -1.734 |
| 1.04 |  10  | -1.244 |
| 1.05 |  10  |  0.016 |
| 1.11 |  13  | -2.737 | <-
| 1.13 |  12  | -4.232 | <-
| 1.19 |  11  | -3.241 | <=
| 1.20 |  12  | -2.832 |
| 1.21 |  10  | -4.277 |

Is there a way to do this?

1 answer:

Answer 0 (score: 0):

First, find the positions where A is NaN:

positions = df['A'].isna().astype(int)

|  positions |
--------------
|      0     |
|      0     |
|      0     |
|      1     |
|      1     |
|      0     |
|      0     |
|      0     |

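Then take a cumulative sum from the bottom up that resets at every non-NaN value, so each NaN gets its distance to the next valid value: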

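# keep the mask boolean so that ~mask is a logical negation, not a bitwise NOT on integers
mask = df['A'].isna().iloc[::-1]
# running count of NaNs, walking from the last row to the first
cumSum = mask.cumsum()
# subtract the count reached at the most recent valid row so the counter restarts after each run
posCumSum = (cumSum - cumSum.where(~mask).ffill().fillna(0).astype(int)).iloc[::-1]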

|  posCumSum |
--------------
|      0     |
|      0     |
|      0     |
|      2     |
|      1     |
|      0     |
|      0     |
|      0     |
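The intermediate values may make the reset easier to follow. The sketch below works on a standalone Series with the same values as column A; the variable names `cum_sum` and `baseline` are illustrative and not part of the answer.

```python
import numpy as np
import pandas as pd

a = pd.Series([10, 10, 10, np.nan, np.nan, 11, 12, 10])

# Boolean mask of missing values, reversed so we walk from the last row up
mask = a.isna().iloc[::-1]

# Running count of NaNs seen so far (bottom to top)
cum_sum = mask.cumsum()

# Count reached at the most recent valid row; 0 before any valid row is seen
baseline = cum_sum.where(~mask).ffill().fillna(0).astype(int)

# Distance of each NaN from the next valid value, restored to the original order
pos_cum_sum = (cum_sum - baseline).iloc[::-1]

print(cum_sum.tolist())      # reversed order
print(baseline.tolist())     # reversed order
print(pos_cum_sum.tolist())  # original order
```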

Finally, add it to the backfilled original column:

df['A'] = df['A'].bfill() + posCumSum

| ind  |   A  |    B   |
------------------------
| 1.01 |  10  | -1.734 |
| 1.04 |  10  | -1.244 |
| 1.05 |  10  |  0.016 |
| 1.11 |  13  | -2.737 | <-
| 1.13 |  12  | -4.232 | <-
| 1.19 |  11  | -3.241 | <=
| 1.20 |  12  | -2.832 |
| 1.21 |  10  | -4.277 |
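Since the title also mentions an increasing fill, the same idea can be wrapped in a small helper with a signed step. This is an alternative formulation using `groupby(...).cumsum()` rather than the code above, and `backfill_with_step` and its `step` parameter are hypothetical names introduced here for illustration.

```python
import numpy as np
import pandas as pd


def backfill_with_step(s: pd.Series, step: float = 1) -> pd.Series:
    """Hypothetical helper: backfill NaNs so that each run ends at the next valid value.

    step=1 gives a sequence that decreases toward the next valid value,
    step=-1 gives one that increases toward it.
    """
    is_na = s.isna()
    # Walking bottom-up, every valid value starts a new group
    run_id = (~is_na).iloc[::-1].cumsum()
    # Within each group, count NaNs: this is the distance to the next valid value
    distance = is_na.astype(int).iloc[::-1].groupby(run_id).cumsum().iloc[::-1]
    # Trailing NaNs have no following valid value, so they stay NaN after bfill
    return s.bfill() + step * distance


a = pd.Series([10, 10, 10, np.nan, np.nan, 11, 12, 10])
decreasing = backfill_with_step(a)
increasing = backfill_with_step(a, step=-1)
print(decreasing.tolist())
print(increasing.tolist())
```

With the example column, `step=1` reproduces the 13, 12, 11 result shown above, while `step=-1` fills the gap with 9, 10 before the 11.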