这是一个数据框:
id value next_value step
0 10764 612486.33921 800000.0 0.0
1 10764 612486.33921 1000000.0 200000.0
2 10822 97604.30520 100000.0 0.0
3 10823 97805.45851 100000.0 0.0
4 10823 97805.45851 165000.0 65000.0
5 10823 97805.45851 250000.0 85000.0
6 10823 97805.45851 400000.0 150000.0
7 10823 97805.45851 600000.0 200000.0
8 10823 97805.45851 800000.0 200000.0
9 10823 97805.45851 1000000.0 200000.0
10 10823 97805.45851 1200000.0 200000.0
11 10882 33843.33701 40000.0 0.0
12 10882 33843.33701 60000.0 20000.0
13 10882 33843.33701 90000.0 30000.0
14 10882 33843.33701 125000.0 35000.0
15 10882 33843.33701 150000.0 25000.0
16 10882 33843.33701 175000.0 25000.0
17 10882 33843.33701 205000.0 30000.0
18 10882 33843.33701 230000.0 25000.0
19 10882 33843.33701 300000.0 70000.0
20 10882 33843.33701 440000.0 140000.0
21 10758 241890.94525 385000.0 0.0
22 10817 138519.73560 150000.0 0.0
23 10817 138519.73560 200000.0 50000.0
24 10817 138519.73560 250000.0 50000.0
25 10817 138519.73560 300000.0 50000.0
26 10817 138519.73560 350000.0 50000.0
27 10817 138519.73560 400000.0 50000.0
28 10859 73140.30048 80000.0 0.0
29 10859 73140.30048 100000.0 20000.0
我需要将step
和next_value
在value
组中的第一个差异加到id
上。
即对于id=10764
step = step + 800000.0 - 612486.33921
,
id=10822
step = step + 100000.0 - 97604.30520
,
等
我已经尝试过了:
df['step'] += df.groupby('id')[['next_value', 'value']].apply
(lambda x: x.iloc[0, 0] - x.iloc[0, 1])
但这给了我所有等于step
的{{1}}值。
我在这里可以做什么?
答案 0 :(得分:2)
使用groupby
id
并获得first
值。然后,对s
使用df.step
In [717]: s = df.groupby('id').transform('first')
In [718]: df.step + s.next_value - s.value
Out[718]:
0 187513.66079
1 387513.66079
2 2395.69480
3 2194.54149
4 67194.54149
5 87194.54149
6 152194.54149
7 202194.54149
8 202194.54149
9 202194.54149
10 202194.54149
11 6156.66299
12 26156.66299
13 36156.66299
14 41156.66299
15 31156.66299
16 31156.66299
17 36156.66299
18 31156.66299
19 76156.66299
20 146156.66299
21 143109.05475
22 11480.26440
23 61480.26440
24 61480.26440
25 61480.26440
26 61480.26440
27 61480.26440
28 6859.69952
29 26859.69952
dtype: float64
详细信息
In [719]: s
Out[719]:
value next_value step
0 612486.33921 800000.0 0.0
1 612486.33921 800000.0 0.0
2 97604.30520 100000.0 0.0
3 97805.45851 100000.0 0.0
4 97805.45851 100000.0 0.0
5 97805.45851 100000.0 0.0
6 97805.45851 100000.0 0.0
7 97805.45851 100000.0 0.0
8 97805.45851 100000.0 0.0
9 97805.45851 100000.0 0.0
10 97805.45851 100000.0 0.0
11 33843.33701 40000.0 0.0
12 33843.33701 40000.0 0.0
13 33843.33701 40000.0 0.0
14 33843.33701 40000.0 0.0
15 33843.33701 40000.0 0.0
16 33843.33701 40000.0 0.0
17 33843.33701 40000.0 0.0
18 33843.33701 40000.0 0.0
19 33843.33701 40000.0 0.0
20 33843.33701 40000.0 0.0
21 241890.94525 385000.0 0.0
22 138519.73560 150000.0 0.0
23 138519.73560 150000.0 0.0
24 138519.73560 150000.0 0.0
25 138519.73560 150000.0 0.0
26 138519.73560 150000.0 0.0
27 138519.73560 150000.0 0.0
28 73140.30048 80000.0 0.0
29 73140.30048 80000.0 0.0