Iteratively recalculating pandas column values based on row values

Time: 2019-05-21 13:06:03

Tags: python pandas

I have a pandas DataFrame:

import pandas as pd

df = pd.DataFrame({
    'item': [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
    'date': ['2017-03-27', '2017-04-03', '2017-04-10', '2017-04-17', '2017-04-24', '2017-05-01',
             '2017-03-27', '2017-04-03', '2017-04-10', '2017-04-17', '2017-04-24', '2017-05-01'],
    'sls': [3, 4, 5, 3, 2, 3, 5, 6, 10, 4, 5, 2],
    'prc': [0, 2, 0, 1, 1, 7, 2, 4, 0, 1, 1, 1],
    'stk': [7, 0, 0, 0, 0, 0, 12, 0, 0, 0, 0, 0]})

It looks like this:


    item        date  sls  prc  stk
0      1  2017-03-27    3    0    7
1      1  2017-04-03    4    2    0
2      1  2017-04-10    5    0    0
3      1  2017-04-17    3    1    0
4      1  2017-04-24    2    1    0
5      1  2017-05-01    3    7    0
6      2  2017-03-27    5    2   12
7      2  2017-04-03    6    4    0
8      2  2017-04-10   10    0    0
9      2  2017-04-17    4    1    0
10     2  2017-04-24    5    1    0
11     2  2017-05-01    2    1    0

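Except for the first record of each item group, I want to calculate the values of the stk column.

I created another column, stock, that holds the calculated values:

def f(g):
    g.stk = (g.stk.shift() + g.prc - g.sls).cumsum()
    return g

df['stock'] = df.stk.replace(0, df.groupby('item').apply(f).stk)

So my updated DataFrame becomes: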


    item        date  sls  prc  stk  stock
0      1  2017-03-27    3    0    7      7
1      1  2017-04-03    4    2    0      5
2      1  2017-04-10    5    0    0      0
3      1  2017-04-17    3    1    0     -2
4      1  2017-04-24    2    1    0     -3
5      1  2017-05-01    3    7    0      1
6      2  2017-03-27    5    2   12     12
7      2  2017-04-03    6    4    0     10
8      2  2017-04-10   10    0    0      0
9      2  2017-04-17    4    1    0     -3
10     2  2017-04-24    5    1    0     -7
11     2  2017-05-01    2    1    0     -8

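But I do not want negative values in the stock column. So how can I do this calculation iteratively: for each item group, if any record in the stock column is negative, that amount must be added to the value in the first record and the calculation repeated, until no negative values remain?

Each value in the column is calculated as the previous row's stk value, minus sls, plus prc.

My expected output is as follows:

    item        date  sls  prc  stk  stock
0      1  2017-03-27    3    0    7     10
1      1  2017-04-03    4    2    0      8
2      1  2017-04-10    5    0    0      3
3      1  2017-04-17    3    1    0      1
4      1  2017-04-24    2    1    0      0
5      1  2017-05-01    3    7    0      4
6      2  2017-03-27    5    2   12     20
7      2  2017-04-03    6    4    0     18
8      2  2017-04-10   10    0    0      8
9      2  2017-04-17    4    1    0      5
10     2  2017-04-24    5    1    0      1
11     2  2017-05-01    2    1    0      0

How can I do this in pandas?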
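To make the requested procedure concrete, here is a minimal plain-Python sketch of the iteration for a single item group. The helper names `running_stock` and `non_negative_stock` are hypothetical and only serve as illustration; they are not part of pandas or of the original post.

```python
def running_stock(first, sls, prc):
    """First value, then previous stock - sls + prc for each later row."""
    stock = [first]
    for s, p in zip(sls[1:], prc[1:]):
        stock.append(stock[-1] - s + p)
    return stock


def non_negative_stock(first, sls, prc):
    """Raise the first record until the running stock has no negative value."""
    stock = running_stock(first, sls, prc)
    while min(stock) < 0:
        first += -min(stock)  # add the shortfall to the first record
        stock = running_stock(first, sls, prc)
    return stock


# Item 1 from the question
print(non_negative_stock(7, [3, 4, 5, 3, 2, 3], [0, 2, 0, 1, 1, 7]))
# [10, 8, 3, 1, 0, 4]
```

Raising the first record by some amount raises every later value by the same amount, so the loop needs at most one correction: shifting the group by its most negative value is enough.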

2 Answers:

Answer 0 (score: 1)

A quick solution:

df['stock'] -= (df.groupby('item').stock
                  .transform(lambda x: x.min() if x.min()<0 else 0))
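The idea can be checked in isolation with a small, self-contained sketch. It starts from the intermediate stock values shown in the question and uses `transform('min')` with `clip(upper=0)` in place of the lambda, which should be equivalent here; that substitution and the `shortfall` variable name are my own choices, not the answerer's code.

```python
import pandas as pd

# Running stock values copied from the question's intermediate table
df = pd.DataFrame({
    'item':  [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
    'stock': [7, 5, 0, -2, -3, 1, 12, 10, 0, -3, -7, -8],
})

# Per-group minimum, clipped at 0 so groups without negatives stay unchanged
shortfall = df.groupby('item')['stock'].transform('min').clip(upper=0)

df['stock'] = df['stock'] - shortfall
print(df['stock'].tolist())
# [10, 8, 3, 1, 0, 4, 20, 18, 8, 5, 1, 0]
```

Because `transform` returns a result aligned with the original index, the subtraction lines up row by row without any merge.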

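Answer 1 (score: 0)

Do the calculation as you already do, then shift each group's column by its minimum value when that minimum is negative: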

import pandas as pd

df = pd.DataFrame({'item':[1,1,1,1,1,1,2,2,2,2,2,2],
               'date':['2017-03-27','2017-04-03','2017-04-10','2017-04-17','2017-04-24','2017-05-01', '2017-03-27','2017-04-03','2017-04-10','2017-04-17','2017-04-24','2017-05-01'],
               'sls':[3,4,5,3,2,3,5,6,10,4,5,2],
               'prc':[0,2,0,1,1,7,2,4,0,1,1,1],
               'stk':[7,0,0,0,0,0,12,0,0,0,0,0]})

def f(g):
    g.stk = (g.stk.shift() + g.prc - g.sls).cumsum()
    return g

df['stock'] = df.stk.replace(0, df.groupby('item').apply(f).stk)
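# Shift a group only when its minimum is negative; otherwise keep it unchanged.
# transform keeps the result aligned with df's index.
df['stock'] = df.groupby('item')['stock'].transform(lambda x: x - x.min() if x.min() < 0 else x)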
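As an alternative that avoids `groupby(...).apply` altogether, the running stock can be built with a grouped cumulative sum. This is a sketch under two assumptions: rows are already sorted by date within each item, and the first row of each group holds the opening stock in `stk`. The names `is_first`, `delta` and `stock` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'item': [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2],
    'sls':  [3, 4, 5, 3, 2, 3, 5, 6, 10, 4, 5, 2],
    'prc':  [0, 2, 0, 1, 1, 7, 2, 4, 0, 1, 1, 1],
    'stk':  [7, 0, 0, 0, 0, 0, 12, 0, 0, 0, 0, 0],
})

# True on the first row of each item group
is_first = df.groupby('item').cumcount() == 0

# Per-row change: opening stock on the first row, prc - sls afterwards
delta = (df['prc'] - df['sls']).where(~is_first, df['stk'])

# Running stock within each item
stock = delta.groupby(df['item']).cumsum()

# Raise each group by its most negative value, if it has one
df['stock'] = stock - stock.groupby(df['item']).transform('min').clip(upper=0)

# Matches the expected stock column from the question
print(df['stock'].tolist())
```

Since adding a constant to the first record shifts the whole group by that constant, one pass per group is sufficient and no explicit iteration is needed.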