Pandas - How to create a new column that takes its value from the previous row, or from the next row if it is the first row

Time: 2019-09-21 18:08:24

Tags: python pandas dataframe

Given the following data:

Time    Col01   Col02
05:17:55.703000 NaN NaN
05:17:55.703000 891 12
05:17:55.703000 891 13
05:17:55.703000 891 15
05:17:55.703000 891 16
05:17:55.703000 891 17
05:17:55.703000 891 18
05:17:55.707000 892  0
05:17:55.707000 892  1
05:17:55.707000 892  5
05:17:55.707000 892  6
05:17:55.707000 892  7
05:17:55.708000 NaN  NaN
05:17:55.711000 892 10
05:17:55.711000 892 11
05:17:55.711000 892 12
05:17:55.723000 893 11
05:17:55.723000 893 15
05:17:55.723000 893 16
05:17:55.726000 NaN  NaN

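I need to create two new columns. Wherever the original column is NaN, the new column should be filled according to the following logic: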

+-----------------+-------+-------+----------+----------+----------------------------------------+
|      Time       | Col01 | Col02 | Col01new | Col02new |                                        |
+-----------------+-------+-------+----------+----------+----------------------------------------+
| 05:17:55.703000 | NaN   | NaN   |      891 |       12 | if NaN & first row, fill from next row |
| 05:17:55.703000 | 891   | 12    |      891 |       12 |                                        |
| 05:17:55.703000 | 891   | 13    |      891 |       13 |                                        |
| 05:17:55.703000 | 891   | 15    |      891 |       15 |                                        |
| 05:17:55.703000 | 891   | 16    |      891 |       16 |                                        |
| 05:17:55.703000 | 891   | 17    |      891 |       17 |                                        |
| 05:17:55.703000 | 891   | 18    |      891 |       18 |                                        |
| 05:17:55.707000 | 892   |  0    |      892 |        0 |                                        |
| 05:17:55.707000 | 892   |  1    |      892 |        1 |                                        |
| 05:17:55.707000 | 892   |  5    |      892 |        5 |                                        |
| 05:17:55.707000 | 892   |  6    |      892 |        6 |                                        |
| 05:17:55.707000 | 892   |  7    |      892 |        7 |                                        |
| 05:17:55.708000 | NaN   |  NaN  |      892 |        7 | if NaN fill from previous row          |
| 05:17:55.711000 | 892   | 10    |      892 |       10 |                                        |
| 05:17:55.711000 | 892   | 11    |      892 |       11 |                                        |
| 05:17:55.711000 | 892   | 12    |      892 |       12 |                                        |
| 05:17:55.723000 | 893   | 11    |      893 |       11 |                                        |
| 05:17:55.723000 | 893   | 15    |      893 |       15 |                                        |
| 05:17:55.723000 | 893   | 16    |      893 |       16 |                                        |
| 05:17:55.726000 | NaN   |  NaN  |      893 |       16 | if NaN fill from previous row          |
+-----------------+-------+-------+----------+----------+----------------------------------------+

2 Answers:

Answer 0: (score: 1)

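Fill in the right order: forward fill first, then backward fill. After the forward fill, the only NaN left can be in the first row, so the backward fill only picks up a value for that row.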

pd.concat([df, df[['Col01', 'Col02']].ffill().bfill(downcast='infer').add_suffix('new')], axis=1)

               Time  Col01  Col02  Col01new  Col02new
0   05:17:55.703000    NaN    NaN       891        12
1   05:17:55.703000  891.0   12.0       891        12
2   05:17:55.703000  891.0   13.0       891        13
3   05:17:55.703000  891.0   15.0       891        15
4   05:17:55.703000  891.0   16.0       891        16
5   05:17:55.703000  891.0   17.0       891        17
6   05:17:55.703000  891.0   18.0       891        18
7   05:17:55.707000  892.0    0.0       892         0
8   05:17:55.707000  892.0    1.0       892         1
9   05:17:55.707000  892.0    5.0       892         5
10  05:17:55.707000  892.0    6.0       892         6
11  05:17:55.707000  892.0    7.0       892         7
12  05:17:55.708000    NaN    NaN       892         7
13  05:17:55.711000  892.0   10.0       892        10
14  05:17:55.711000  892.0   11.0       892        11
15  05:17:55.711000  892.0   12.0       892        12
16  05:17:55.723000  893.0   11.0       893        11
17  05:17:55.723000  893.0   15.0       893        15
18  05:17:55.723000  893.0   16.0       893        16
19  05:17:55.726000    NaN    NaN       893        16
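
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same forward-then-backward fill. The small DataFrame is a shortened stand-in for the question's data, not the original. It casts to int explicitly with astype instead of passing downcast='infer', since, as far as I know, newer pandas releases deprecate the downcast argument; treat that substitution as an assumption about your pandas version.

```python
import numpy as np
import pandas as pd

# Shortened stand-in for the question's data: NaN in the first,
# a middle, and the last row.
df = pd.DataFrame({
    "Time": ["05:17:55.703000", "05:17:55.703000", "05:17:55.707000",
             "05:17:55.708000", "05:17:55.711000", "05:17:55.726000"],
    "Col01": [np.nan, 891, 892, np.nan, 892, np.nan],
    "Col02": [np.nan, 12, 7, np.nan, 10, np.nan],
})

# Forward fill first (previous row wins), then backward fill so a
# leading NaN takes its value from the next row.
filled = df[["Col01", "Col02"]].ffill().bfill()

# No NaN is left at this point, so an explicit integer cast is safe.
filled = filled.astype("int64").add_suffix("new")

result = pd.concat([df, filled], axis=1)
print(result)
```

The order matters: running bfill first would fill the middle and last gaps from the following row instead of the previous one, which is not what the question asks for.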

Answer 1: (score: 0)

This will also work:

df.ffill(axis=0).bfill(axis=0)

If you want the filled values in separate columns, copy the columns first and then apply the fill to the copies.
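
A rough sketch of that copy-then-fill idea, assuming the new columns are named Col01new and Col02new as in the question. The sample data is made up for illustration. The fill is restricted to the copied columns, because calling ffill/bfill on the whole DataFrame would overwrite the NaN values in the original columns as well.

```python
import numpy as np
import pandas as pd

# Made-up sample with NaN in the first, a middle, and the last row.
df = pd.DataFrame({
    "Time": ["05:17:55.703000", "05:17:55.703000", "05:17:55.708000",
             "05:17:55.723000", "05:17:55.726000"],
    "Col01": [np.nan, 891, np.nan, 893, np.nan],
    "Col02": [np.nan, 12, np.nan, 16, np.nan],
})

# Copy the original columns under new names first.
df["Col01new"] = df["Col01"]
df["Col02new"] = df["Col02"]

# Fill only the copies: forward fill, then backward fill for the first row.
fill_cols = ["Col01new", "Col02new"]
df[fill_cols] = df[fill_cols].ffill(axis=0).bfill(axis=0)

print(df)
```

The new columns stay float here; add an astype("int64") on the filled copies if you need integers.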