I referred to "How to create rolling percentage for groupby DataFrame" and tried the following:
import pandas as pd
data = [
('product_a','1/31/2014',53)
,('product_b','1/31/2014',44)
,('product_c','1/31/2014',36)
,('product_a','11/30/2013',52)
,('product_b','11/30/2013',43)
,('product_c','11/30/2013',35)
,('product_a','3/31/2014',50)
,('product_b','3/31/2014',41)
,('product_c','3/31/2014',34)
,('product_a','12/31/2013',50)
,('product_b','12/31/2013',41)
,('product_c','12/31/2013',34)
,('product_a','2/28/2014',52)
,('product_b','2/28/2014',43)
,('product_c','2/28/2014',35)]
product_df = pd.DataFrame( data, columns=['prod_desc','activity_month','prod_count'] )
product_df.sort_values('activity_month', inplace = True, ascending=False)
product_df['pct_ch'] = product_df.groupby('prod_desc')['prod_count'].pct_change() + 1
print(product_df)
However, I cannot reproduce the output shown in the suggested answer.
Output produced:
prod_desc activity_month prod_count pct_ch
0 product_a 1/31/2014 53 NaN
1 product_b 1/31/2014 44 0.830189
2 product_c 1/31/2014 36 0.818182
3 product_a 11/30/2013 52 1.444444
4 product_b 11/30/2013 43 0.826923
5 product_c 11/30/2013 35 0.813953
9 product_a 12/31/2013 50 1.428571
10 product_b 12/31/2013 41 0.820000
11 product_c 12/31/2013 34 0.829268
12 product_a 2/28/2014 52 1.529412
13 product_b 2/28/2014 43 0.826923
14 product_c 2/28/2014 35 0.813953
6 product_a 3/31/2014 50 1.428571
7 product_b 3/31/2014 41 0.820000
8 product_c 3/31/2014 34 0.829268
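The expected output should look like the following: the percentage change should be computed separately for each prod_desc (product_a, product_b and product_c), not across the whole column at once.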
product_desc activity_month prod_count pct_ch
0 product_a 2014-01-01 53 NaN
3 product_a 2014-02-01 26 0.490566
6 product_a 2014-03-01 41 1.576923
1 product_b 2014-01-01 42 NaN
4 product_b 2014-02-01 48 1.142857
7 product_b 2014-03-01 35 0.729167
2 product_c 2014-01-01 38 NaN
5 product_c 2014-02-01 39 1.026316
8 product_c 2014-03-01 50 1.282051
Thanks in advance.
Answer 0 (score: 2)
Convert activity_month to datetime, sort within each product, then use GroupBy.apply with Series.pct_change:
product_df['activity_month'] = pd.to_datetime(product_df['activity_month'])
product_df.sort_values(['prod_desc','activity_month'], inplace = True, ascending=[True, False])
product_df['pct_ch'] = (product_df.groupby('prod_desc')['prod_count']
.apply(pd.Series.pct_change) + 1)
print(product_df)
prod_desc activity_month prod_count pct_ch
6 product_a 2014-03-31 50 NaN
12 product_a 2014-02-28 52 1.040000
0 product_a 2014-01-31 53 1.019231
9 product_a 2013-12-31 50 0.943396
3 product_a 2013-11-30 52 1.040000
7 product_b 2014-03-31 41 NaN
13 product_b 2014-02-28 43 1.048780
1 product_b 2014-01-31 44 1.023256
10 product_b 2013-12-31 41 0.931818
4 product_b 2013-11-30 43 1.048780
8 product_c 2014-03-31 34 NaN
14 product_c 2014-02-28 35 1.029412
2 product_c 2014-01-31 36 1.028571
11 product_c 2013-12-31 34 0.944444
5 product_c 2013-11-30 35 1.029412
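The output above is sorted newest-first, so each ratio compares a month with the following month. If the goal is the chronological layout from the question (first month NaN, each later month divided by the previous one), a minimal sketch follows. It assumes dates in month/day/year form and uses a shortened copy of the question's data. It also calls pct_change directly on the grouped column instead of going through apply; this is a swapped-in variant, not the code above. On recent pandas versions the apply form may return a result that also carries the group key in its index, which may not align on assignment.

```python
import pandas as pd

# Shortened copy of the question's data (two products, three months).
data = [
    ('product_a', '1/31/2014', 53),
    ('product_b', '1/31/2014', 44),
    ('product_a', '3/31/2014', 50),
    ('product_b', '3/31/2014', 41),
    ('product_a', '2/28/2014', 52),
    ('product_b', '2/28/2014', 43),
]
df = pd.DataFrame(data, columns=['prod_desc', 'activity_month', 'prod_count'])

# Parse the dates first: sorting the raw strings is lexicographic,
# so '11/30/2013' would land between '1/31/2014' and '2/28/2014'.
df['activity_month'] = pd.to_datetime(df['activity_month'], format='%m/%d/%Y')

# Oldest month first within each product.
df = df.sort_values(['prod_desc', 'activity_month'])

# Ratio of each month to the previous month, computed per product.
df['pct_ch'] = df.groupby('prod_desc')['prod_count'].pct_change() + 1
print(df)
```

The first row of each product is NaN because it has no earlier month to compare against.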
Answer 1 (score: 0)
If you need the change over a specific number of periods, you can use the following code:
product_df['pct_ch'] = (product_df.groupby('prod_desc')['prod_count']
.apply(lambda dfi : dfi.pct_change(periods=126)) + 1)
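Note that periods counts rows within each group, so with only five months per product periods=126 leaves every value NaN. A small sketch of the difference, using product_a's five counts in chronological order and periods=2 as an arbitrary illustrative choice (neither the value 2 nor the column names pct_ch_2 and pct_ch_126 come from the answer):

```python
import pandas as pd

# product_a's counts from the question, oldest month first.
df = pd.DataFrame({
    'prod_desc': ['product_a'] * 5,
    'activity_month': pd.to_datetime(
        ['2013-11-30', '2013-12-31', '2014-01-31', '2014-02-28', '2014-03-31']),
    'prod_count': [52, 50, 53, 52, 50],
})
df = df.sort_values(['prod_desc', 'activity_month'])

grouped = df.groupby('prod_desc')['prod_count']

# Compare each month with the month two rows earlier in the same group.
df['pct_ch_2'] = grouped.pct_change(periods=2) + 1

# 126 rows back never exists in a five-row group, so this column is all NaN.
df['pct_ch_126'] = grouped.pct_change(periods=126) + 1
print(df)
```

With periods=2 the first two rows of each group are NaN, since they have no row two steps back.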