Pandas: replace zero values with the values from another column

Time: 2017-05-30 10:08:28

Tags: python pandas

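How can I replace the zero values in a column with the value from the same row of another column, but only where the previous row's value in that column is also zero, i.e. only while no non-zero value has been encountered yet? For example, given a dataframe with columns a, b and c: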

+----+-----+-----+----+
|    |   a |   b |  c |
|----+-----+-----+----|
|  0 |   2 |   0 |  0 |
|  1 |   5 |   0 |  0 |
|  2 |   3 |   4 |  0 |
|  3 |   2 |   0 |  3 |
|  4 |   1 |   8 |  1 |
+----+-----+-----+----+

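The zero values in b and c should be replaced with the value of a wherever the previous value in that column is also zero: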

+----+-----+-----+----+
|    |   a |   b |  c |
|----+-----+-----+----|
|  0 |   2 |   2 |  2 |
|  1 |   5 |   5 |  5 |
|  2 |   3 |   4 |  3 |
|  3 |   2 |   0 |  3 | <-- zero in this row is not replaced because of  
|  4 |   1 |   8 |  1 |     non-zero value (4) in row before it.
+----+-----+-----+----+

1 answer:

Answer 0: (score: 1)

In [90]: (df[~df.apply(lambda c: c.eq(0) & c.shift().fillna(0).eq(0))]
    ...:    .fillna(pd.DataFrame(np.tile(df.a.values[:, None], df.shape[1]),
    ...:                         columns=df.columns, index=df.index))
    ...:    .astype(int)
    ...: )
Out[90]:
   a  b  c
0  2  2  2
1  5  5  5
2  3  4  3
3  2  0  3
4  1  8  1
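The session above assumes that `pd`, `np` and `df` already exist. A minimal self-contained sketch of the same expression follows; the construction of `df` is taken from the table in the question, and the intermediate names `to_replace` and `filler` are only illustrative, they do not appear in the original answer.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"a": [2, 5, 3, 2, 1],
                   "b": [0, 0, 4, 0, 8],
                   "c": [0, 0, 0, 3, 1]})

# True where a cell is zero AND the cell directly above it is zero
# (the first row has no cell above; fillna(0) treats that as zero)
to_replace = df.apply(lambda c: c.eq(0) & c.shift().fillna(0).eq(0))

# Column a repeated once per column, with the same index and column labels as df
filler = pd.DataFrame(np.tile(df["a"].values[:, None], df.shape[1]),
                      columns=df.columns, index=df.index)

# Blank out the cells to replace (they become NaN), fill them from filler,
# then cast back to int because the NaN step turned the columns into floats
result = df[~to_replace].fillna(filler).astype(int)
print(result)
```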

Explanation:

In [91]: df[~df.apply(lambda c: c.eq(0) & c.shift().fillna(0).eq(0))]
Out[91]:
   a    b    c
0  2  NaN  NaN
1  5  NaN  NaN
2  3  4.0  NaN
3  2  0.0  3.0
4  1  8.0  1.0
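Note that the condition above only looks at the row immediately above each cell. On the sample data this coincides with "no non-zero value has been encountered yet", but the two rules can differ when a zero run appears after a non-zero value. The sketch below compares them on hypothetical data (column b is changed from the question); `leading_rule` is an assumed alternative, not part of the original answer.

```python
import pandas as pd

# Hypothetical data: column b ends with two zeros after a non-zero value
df = pd.DataFrame({"a": [2, 5, 3, 2, 1],
                   "b": [0, 0, 4, 0, 0],
                   "c": [0, 0, 0, 3, 1]})

# Rule used in the answer: zero, and the row directly above is zero (or missing)
prev_row_rule = df.apply(lambda c: c.eq(0) & c.shift().fillna(0).eq(0))

# Stricter rule: the running count of non-zero values in the column is still 0,
# so the cell is zero and nothing above it is non-zero
leading_rule = df.ne(0).cumsum().eq(0)

print(prev_row_rule["b"].tolist())  # [True, True, False, False, True]
print(leading_rule["b"].tolist())   # [True, True, False, False, False]
```

Which rule is appropriate depends on whether a later run of zeros should also be filled from a.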

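Now we can fill the NaNs with the corresponding values from the DataFrame below (it is built as three copies of column a placed side by side):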

In [92]: pd.DataFrame(np.tile(df.a.values[:, None], df.shape[1]), columns=df.columns, index=df.index)
Out[92]:
   a  b  c
0  2  2  2
1  5  5  5
2  3  3  3
3  2  2  2
4  1  1  1
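As a possible alternative to building the tiled DataFrame, `DataFrame.mask` can take column a directly as the replacement when `axis=0` is given, so that the Series is aligned on the row index. This is a sketch under the same previous-row condition as above, not the method used in the original answer; `to_replace` is an illustrative name.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"a": [2, 5, 3, 2, 1],
                   "b": [0, 0, 4, 0, 8],
                   "c": [0, 0, 0, 3, 1]})

# True where a cell is zero and the cell directly above it is zero (or missing)
to_replace = df.apply(lambda c: c.eq(0) & c.shift().fillna(0).eq(0))

# Where to_replace is True, take the value of column a from the same row;
# axis=0 aligns the Series df["a"] with the rows of df
result = df.mask(to_replace, df["a"], axis=0)
print(result)
```

Because no NaN values are introduced along the way, this variant should not need the final `.astype(int)` cast.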