Calculate a new column based on the values of other columns in a Python pandas DataFrame

Time: 2019-09-05 06:12:56

Tags: python pandas

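I want to create a new column based on the values of other columns in a pandas DataFrame. My data describes a truck that travels back and forth between a loading location and an unloading location. For each row, I want to compute the distance from the current segment to the end of the route. A sample of the data is shown below: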

State      | segment length | 
-----------------------------
Loaded     |    20          |
Loaded     |    10          |
Loaded     |    10          |
Empty      |    15          |
Empty      |    10          |
Empty      |    10          |
Loaded     |    30          |
Loaded     |    20          |
Loaded     |    10          |

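The end of a route is the record where State changes. For each row I therefore want the distance to the end of the route, including the current segment. The final DataFrame would be: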

State   | segment length | Distance to end
Loaded  |       20       |     40
Loaded  |       10       |     20
Loaded  |       10       |     10
Empty   |       15       |     35
Empty   |       10       |     20
Empty   |       10       |     10
Loaded  |       30       |     60
Loaded  |       20       |     30
Loaded  |       10       |     10

Can anyone help? Thanks in advance.

2 answers:

Answer 0: (score: 4)

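Use GroupBy.cumsum together with DataFrame.iloc to reverse the row order, and group by a helper Series built with shift and cumsum that labels each run of consecutive identical values: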

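# Label each run of consecutive identical State values: 1, 1, 1, 2, 2, 2, ...
g = df['State'].ne(df['State'].shift()).cumsum()
# Reverse the rows, take the cumulative sum within each run, and assign back (aligned on the index)
df['Distance to end'] = df.iloc[::-1].groupby(g)['segment length'].cumsum()
print(df)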
    State  segment length  Distance to end
0  Loaded              20               40
1  Loaded              10               20
2  Loaded              10               10
3   Empty              15               35
4   Empty              10               20
5   Empty              10               10
6  Loaded              30               60
7  Loaded              20               30
8  Loaded              10               10

Details:

print (g)
0    1
1    1
2    1
3    2
4    2
5    2
6    3
7    3
8    3
Name: State, dtype: int32
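For anyone who wants to run the approach above end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample in the question; the names starts and run_id are only illustrative and do not appear in the answer.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "State": ["Loaded"] * 3 + ["Empty"] * 3 + ["Loaded"] * 3,
    "segment length": [20, 10, 10, 15, 10, 10, 30, 20, 10],
})

# True wherever State differs from the previous row (the first row is always True)
starts = df["State"].ne(df["State"].shift())

# Counting the run starts cumulatively gives one label per consecutive run
run_id = starts.cumsum()

# Reverse the rows, accumulate within each run, and let pandas
# align the result back onto the original index on assignment
df["Distance to end"] = (
    df.iloc[::-1].groupby(run_id)["segment length"].cumsum()
)

print(df)
```

Grouping by run_id rather than by State itself is what keeps the first and the last "Loaded" runs separate.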

Answer 1: (score: 0)

df['Distance to end'] = (
    df.assign(i=df.State.ne(df.State.shift()).cumsum())
    .assign(s=lambda x: x.groupby(by='i')['segment length'].transform('sum'))
    .groupby(by='i')
    .apply(lambda x: x.s.sub(x['segment length'].shift().cumsum().fillna(0)))
    .values
)

    State   segment length  Distance to end
0   Loaded  20              40.0
1   Loaded  10              20.0
2   Loaded  10              10.0
3   Empty   15              35.0
4   Empty   10              20.0
5   Empty   10              10.0
6   Loaded  30              60.0
7   Loaded  20              30.0
8   Loaded  10              10.0
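The float values in the output above come from shift(), which introduces NaN and so upcasts the column. As a hedged alternative sketch of the same idea (run total minus the distance already covered), the version below avoids apply and keeps integers. It assumes the sample data from the question; run_id and lengths are illustrative names.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "State": ["Loaded"] * 3 + ["Empty"] * 3 + ["Loaded"] * 3,
    "segment length": [20, 10, 10, 15, 10, 10, 30, 20, 10],
})

# One label per run of consecutive identical State values
run_id = df["State"].ne(df["State"].shift()).cumsum()
lengths = df.groupby(run_id)["segment length"]

# Run total minus the cumulative length so far; cumsum() already
# includes the current row, so the row's own length is added back
df["Distance to end"] = (
    lengths.transform("sum") - lengths.cumsum() + df["segment length"]
)

print(df)
```

Because no NaN is ever produced, the new column stays an integer column.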