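I have a DataFrame that I need to burn down: start from the baseline and subtract each value in turn. Essentially, I am looking for the opposite of DataFrame().cumsum(0):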
In                 Use
Baseline        3705.0
February 2018      0.0
March 2018         2.0
April 2018        15.0
May 2018          30.0
June 2018         14.0
July 2018        797.0
August 2018     1393.0
September 2018    86.0
October 2018     374.0
November 2018     21.0
December 2018      0.0
January 2019       0.0
February 2019      0.0
March 2019         0.0
April 2019         2.0
unknown          971.0
I could not find a function that does this, or perhaps I do not know the right tag or name to search for.
How can I achieve this?
Answer 0 (score: 2)
Use DataFrameGroupBy.diff on groups created by taking diff, comparing with lt (<) to flag each drop, and then taking the cumulative sum of those flags:
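# A negative step starts a new group; cumsum turns the flags into group ids
g = df['Use'].diff().lt(0).cumsum()
# Difference within each group; the first row of a group keeps its own value
df['new'] = df['Use'].groupby(g).diff().fillna(df['Use'])
print(df)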
                In     Use     new
0         Baseline  3705.0  3705.0
1    February 2018     0.0     0.0
2       March 2018     2.0     2.0
3       April 2018    15.0    13.0
4         May 2018    30.0    15.0
5        June 2018    14.0    14.0
6        July 2018   797.0   783.0
7      August 2018  1393.0   596.0
8   September 2018    86.0    86.0
9     October 2018   374.0   288.0
10   November 2018    21.0    21.0
11   December 2018     0.0     0.0
12    January 2019     0.0     0.0
13   February 2019     0.0     0.0
14      March 2019     0.0     0.0
15      April 2019     2.0     2.0
16         unknown   971.0   969.0
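To make the grouping step easier to follow, here is a minimal self-contained sketch that rebuilds the first nine rows of the question's data and keeps the group key as a visible column. The column name `run` is my own choice for illustration and is not part of the answer above.

```python
import pandas as pd

# First nine rows of the question's data, enough to show several resets.
df = pd.DataFrame({
    "In": ["Baseline", "February 2018", "March 2018", "April 2018",
           "May 2018", "June 2018", "July 2018", "August 2018",
           "September 2018"],
    "Use": [3705.0, 0.0, 2.0, 15.0, 30.0, 14.0, 797.0, 1393.0, 86.0],
})

# A negative step marks the start of a new run of non-decreasing values;
# cumsum turns those marks into run ids. "run" is a hypothetical column name.
df["run"] = df["Use"].diff().lt(0).cumsum()

# Difference within each run. The first row of a run has no predecessor
# (diff gives NaN there), so it falls back to its own value.
df["new"] = df["Use"].groupby(df["run"]).diff().fillna(df["Use"])
print(df)
```

Rows that share a `run` id are differenced against each other, which is why a drop such as 30.0 to 14.0 yields 14.0 rather than a negative number.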
Answer 1 (score: 1)
You can use pd.Series.diff together with fillna. Here is a demo:
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': np.random.randint(0, 10, 5)})
df['B'] = df['A'].cumsum()  # running total of A
df['C'] = df['B'].diff().fillna(df['B']).astype(int)  # undoes the cumsum, so C equals A
print(df)
   A   B  C
0  1   1  1
1  4   5  4
2  4   9  4
3  2  11  2
4  1  12  1
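Both answers above read "opposite of cumsum" as differencing. If "burn down" is instead meant as a running remainder (baseline minus everything used so far), a sketch along the following lines may be closer to what was asked. This is only one possible reading and is not confirmed by the asker; the column name `remaining` is an assumption of mine. In the question's data the values after the baseline sum to exactly 3705.0, so under this reading the remainder ends at zero.

```python
import pandas as pd

# Full data from the question: a baseline followed by per-period usage.
df = pd.DataFrame({
    "In": ["Baseline", "February 2018", "March 2018", "April 2018",
           "May 2018", "June 2018", "July 2018", "August 2018",
           "September 2018", "October 2018", "November 2018",
           "December 2018", "January 2019", "February 2019",
           "March 2019", "April 2019", "unknown"],
    "Use": [3705.0, 0.0, 2.0, 15.0, 30.0, 14.0, 797.0, 1393.0, 86.0,
            374.0, 21.0, 0.0, 0.0, 0.0, 0.0, 2.0, 971.0],
})

baseline = df["Use"].iloc[0]

# Usage consumed so far, excluding the baseline row itself; the baseline
# row is filled with 0.0 because nothing has been consumed at that point.
consumed = df["Use"].iloc[1:].cumsum().reindex(df.index, fill_value=0.0)

# "remaining" is a hypothetical column name for the burn-down value.
df["remaining"] = baseline - consumed
print(df)
```

The baseline row is kept out of the cumulative sum so that it is not subtracted from itself.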