Question

I have the following pandas DataFrame:

import pandas as pd

data = {"first_name": ["Alexander", "Alan", "Heather", "Marion", "Amy", "John"],
        "last_name": ["Miller", "Jacobson", ".", "Milner", "Cooze", "Smith"],
        "age": [42, 52, 36, 24, 73, 19],
        "marriage_status": [0, 0, 1, 1, 0, 1]}

df = pd.DataFrame(data)
df

  age first_name last_name  marriage_status
0   42  Alexander    Miller                0
1   52       Alan  Jacobson                0
2   36    Heather         .                1
3   24     Marion    Milner                1
4   73        Amy     Cooze                0
5   19       John     Smith                1
....

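The column marriage_status is a binary column of 0s and 1s. Before each 1, I would like to set the preceding row to 1 as well. In this example, the DataFrame would become: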

  age first_name last_name  marriage_status
0   42  Alexander    Miller                0
1   52       Alan  Jacobson                1   # this changed to 1
2   36    Heather         .                1
3   24     Marion    Milner                1
4   73        Amy     Cooze                1   # this changed to 1
5   19       John     Smith                1
....

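In other words, there are "groups" of consecutive 1s in this column, and I would like the row immediately before each group to be 1 instead of 0. How can I do this?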

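My idea was to somehow write a for loop, but that would not be a pandas-based solution. I could also try enumerate(), but I would need to set the preceding value to 1 rather than add to it, and I am not sure how that would work.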

Answer 1

You can use the Series.shift(-1) method:

In [21]: df.loc[df.marriage_status.shift(-1) == 1, 'marriage_status'] = 1

In [22]: df
Out[22]:
   age first_name last_name  marriage_status
0   42  Alexander    Miller                0
1   52       Alan  Jacobson                1
2   36    Heather         .                1
3   24     Marion    Milner                1
4   73        Amy     Cooze                1
5   19       John     Smith                1
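
To see why this works, here is a minimal self-contained sketch that uses only the marriage_status column from the question. shift(-1) moves every value up by one row, so each row is compared against the value of the row below it; the last row has no successor, gets NaN, and is left unchanged. The variable name next_is_one is introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({"marriage_status": [0, 0, 1, 1, 0, 1]})

# shift(-1) aligns each row with the value of the row below it;
# the last row has no successor and becomes NaN, so it compares False.
next_is_one = df["marriage_status"].shift(-1) == 1

# The mask is built from the original column before any assignment,
# so a newly written 1 does not cascade further up the column.
df.loc[next_is_one, "marriage_status"] = 1

print(df["marriage_status"].tolist())  # [0, 1, 1, 1, 1, 1]
```

Because the mask is computed once, up front, only the single row directly above each 1 is changed, which matches the desired output in the question.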

Answer 2

We can use the or operator |. It treats 1 as True and 0 as False. | evaluates to True when we have False in one row and True in the next.
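
One way this idea might be written is sketched below; it is not necessarily the answerer's exact code. It assumes the column holds only the integers 0 and 1, and it relies on the fill_value argument of Series.shift (available in pandas 0.24 and later) so that the shifted column stays integer-typed instead of gaining a NaN in the last row. The variable name s is illustrative.

```python
import pandas as pd

df = pd.DataFrame({"marriage_status": [0, 0, 1, 1, 0, 1]})

s = df["marriage_status"]

# Combine each row with the row below it using element-wise OR.
# fill_value=0 gives the last row a 0 as its "next" value, so it is unchanged.
df["marriage_status"] = s | s.shift(-1, fill_value=0)

print(df["marriage_status"].tolist())  # [0, 1, 1, 1, 1, 1]
```

On 0/1 integers the bitwise OR gives the same result as a logical OR, so a row ends up as 1 if it was already 1 or if the row below it is 1.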

Given a binary column in a pandas DataFrame, how do I change the preceding 0 to 1?
