Given the following data:
Time Col01 Col02
05:17:55.703000 NaN NaN
05:17:55.703000 891 12
05:17:55.703000 891 13
05:17:55.703000 891 15
05:17:55.703000 891 16
05:17:55.703000 891 17
05:17:55.703000 891 18
05:17:55.707000 892 0
05:17:55.707000 892 1
05:17:55.707000 892 5
05:17:55.707000 892 6
05:17:55.707000 892 7
05:17:55.708000 NaN NaN
05:17:55.711000 892 10
05:17:55.711000 892 11
05:17:55.711000 892 12
05:17:55.723000 893 11
05:17:55.723000 893 15
05:17:55.723000 893 16
05:17:55.726000 NaN NaN
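I need to create two new columns. Where the current column is NaN, the new columns should be filled according to the following logic: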
+-----------------+-------+-------+----------+----------+----------------------------------------+
| Time | Col01 | Col02 | Col01new | Col02new | |
+-----------------+-------+-------+----------+----------+----------------------------------------+
| 05:17:55.703000 | NaN | NaN | 891 | 12 | if NaN & first row, fill from next row |
| 05:17:55.703000 | 891 | 12 | 891 | 12 | |
| 05:17:55.703000 | 891 | 13 | 891 | 13 | |
| 05:17:55.703000 | 891 | 15 | 891 | 15 | |
| 05:17:55.703000 | 891 | 16 | 891 | 16 | |
| 05:17:55.703000 | 891 | 17 | 891 | 17 | |
| 05:17:55.703000 | 891 | 18 | 891 | 18 | |
| 05:17:55.707000 | 892 | 0 | 892 | 0 | |
| 05:17:55.707000 | 892 | 1 | 892 | 1 | |
| 05:17:55.707000 | 892 | 5 | 892 | 5 | |
| 05:17:55.707000 | 892 | 6 | 892 | 6 | |
| 05:17:55.707000 | 892 | 7 | 892 | 7 | |
| 05:17:55.708000 | NaN | NaN | 892 | 7 | if NaN fill from previous row |
| 05:17:55.711000 | 892 | 10 | 892 | 10 | |
| 05:17:55.711000 | 892 | 11 | 892 | 11 | |
| 05:17:55.711000 | 892 | 12 | 892 | 12 | |
| 05:17:55.723000 | 893 | 11 | 893 | 11 | |
| 05:17:55.723000 | 893 | 15 | 893 | 15 | |
| 05:17:55.723000 | 893 | 16 | 893 | 16 | |
| 05:17:55.726000 | NaN | NaN | 893 | 16 | if NaN fill from previous row |
+-----------------+-------+-------+----------+----------+----------------------------------------+
Answer 0 (score: 1)
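Fill in the right order: forward fill first, then backward fill. After the forward fill, the only NaN left is the first row (if it was empty), so the backward fill only affects that row.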
pd.concat([df, df[['Col01', 'Col02']].ffill().bfill(downcast='infer').add_suffix('new')], axis=1)
Time Col01 Col02 Col01new Col02new
0 05:17:55.703000 NaN NaN 891 12
1 05:17:55.703000 891.0 12.0 891 12
2 05:17:55.703000 891.0 13.0 891 13
3 05:17:55.703000 891.0 15.0 891 15
4 05:17:55.703000 891.0 16.0 891 16
5 05:17:55.703000 891.0 17.0 891 17
6 05:17:55.703000 891.0 18.0 891 18
7 05:17:55.707000 892.0 0.0 892 0
8 05:17:55.707000 892.0 1.0 892 1
9 05:17:55.707000 892.0 5.0 892 5
10 05:17:55.707000 892.0 6.0 892 6
11 05:17:55.707000 892.0 7.0 892 7
12 05:17:55.708000 NaN NaN 892 7
13 05:17:55.711000 892.0 10.0 892 10
14 05:17:55.711000 892.0 11.0 892 11
15 05:17:55.711000 892.0 12.0 892 12
16 05:17:55.723000 893.0 11.0 893 11
17 05:17:55.723000 893.0 15.0 893 15
18 05:17:55.723000 893.0 16.0 893 16
19 05:17:55.726000 NaN NaN 893 16
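To see why the order matters, here is a minimal sketch on a shortened, hypothetical sample with the same NaN pattern as the question (first row, middle, last row). It assumes every column has at least one non-NaN value, so casting to `int64` after filling is safe. It also uses `astype` instead of `downcast='infer'`, since the `downcast` keyword appears to be deprecated in recent pandas releases.

```python
import numpy as np
import pandas as pd

# Hypothetical, shortened sample: NaN in the first row, in the middle,
# and in the last row, as in the question.
df = pd.DataFrame({
    "Time": ["05:17:55.703000", "05:17:55.703000", "05:17:55.707000",
             "05:17:55.708000", "05:17:55.723000", "05:17:55.726000"],
    "Col01": [np.nan, 891, 892, np.nan, 893, np.nan],
    "Col02": [np.nan, 12, 7, np.nan, 16, np.nan],
})
cols = ["Col01", "Col02"]

# Forward fill first, then backward fill: interior and trailing NaNs take
# the previous row's value, and only the leading NaN takes the next row's.
filled = df[cols].ffill().bfill().astype("int64").add_suffix("new")
result = pd.concat([df, filled], axis=1)

# Reversed order, for comparison: the interior NaN now takes the NEXT
# row's value, which is not what the question asks for.
reversed_fill = df[cols].bfill().ffill().astype("int64")

print(result)
print(reversed_fill)
```

With the forward fill first, the interior NaN row picks up 892 and 7 from the row above it. With the order reversed, it would pick up 893 and 16 from the row below.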
Answer 1 (score: 0)
This will also work:
df.ffill(axis=0).bfill(axis=0)
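Note that this fills the existing columns rather than adding new ones. If you want separate columns, copy the columns first and then fill the copies.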
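The copy-first approach could look roughly like the sketch below. The sample frame is hypothetical, and the `new` suffix is taken from the question's expected output.

```python
import numpy as np
import pandas as pd

# Hypothetical small sample with leading, interior and trailing NaNs.
df = pd.DataFrame({
    "Col01": [np.nan, 891, np.nan, 893, np.nan],
    "Col02": [np.nan, 12, np.nan, 16, np.nan],
})

# Copy the original columns first so they keep their NaNs.
df["Col01new"] = df["Col01"]
df["Col02new"] = df["Col02"]

# Fill only the copies: forward fill, then backward fill for the first row.
new_cols = ["Col01new", "Col02new"]
df[new_cols] = df[new_cols].ffill(axis=0).bfill(axis=0)

print(df)
```

The new columns stay as floats here. Cast them with `astype("int64")` if integer output is needed and no NaN remains.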