Is there a way to achieve this without generating dummy rows?
This is my source data.
Group Store Month Revenue
Group1 A 201611 10
Group1 A 201612 20
Group1 A 201701 30
Group1 B 201611 40
Group1 B 201701 60
Group2 C 201611 70
Group2 C 201612 80
Group2 C 201702 100
This is the desired output:
Group Store Month Revenue Month_LM Revenue_LM
Group1 A 201611 10 201610
Group1 A 201612 20 201611 10
Group1 A 201701 30 201612 20
Group1 B 201611 50 201610
Group1 B 201701 70 201612
Group1 B 201702 80 201701 70
Group2 C 201611 90 201610
Group2 C 201612 100 201611 90
Group2 C 201702 120 201701
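The problem is with stores B and C (note that 201612 is missing for B and 201701 is missing for C). If I use shift(), I get the value of the previous row, which is the previous transaction but not the previous calendar month that the business logic requires.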
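To make the pitfall concrete, here is a minimal sketch using only store B's rows from the desired output above. The column name `Revenue_prev_row` is made up for illustration.

```python
import pandas as pd

# Store B as shown in the desired output: 201612 is missing.
df = pd.DataFrame({
    "Group": ["Group1"] * 3,
    "Store": ["B"] * 3,
    "Month": [201611, 201701, 201702],
    "Revenue": [50, 70, 80],
})

# shift() moves values down by one row within each store,
# no matter how many calendar months lie between the rows.
df["Revenue_prev_row"] = df.groupby(["Group", "Store"])["Revenue"].shift()
print(df)
```

The 201701 row picks up 50, the revenue of 201611, although the month before 201701 is 201612 and has no data.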
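I managed to get Month_LM with:
from datetime import datetime
from dateutil.relativedelta import relativedelta

def get_lm(month):
    # Parse YYYYMM as the first day of that month, then step back one month.
    d = datetime.strptime(month + "01", "%Y%m%d")
    d = d - relativedelta(months=1)
    return d.strftime("%Y%m")

df['Month_LM'] = df['Month'].apply(lambda x: get_lm(str(x)))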
But I don't know how to look up the revenue of the row whose Month equals Month_LM. Maybe df.lookup?
Thanks.
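One way to read this lookup is as a left self-merge on (Group, Store, Month_LM). The sketch below is only an illustration under these assumptions: Month is stored as an integer in YYYYMM form, each (Group, Store, Month) combination occurs at most once, and the data is taken from the desired output table. The names `lookup` and `out` are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "Group": ["Group1"] * 6 + ["Group2"] * 3,
    "Store": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "Month": [201611, 201612, 201701, 201611, 201701, 201702,
              201611, 201612, 201702],
    "Revenue": [10, 20, 30, 50, 70, 80, 90, 100, 120],
})

# Previous month on YYYYMM integers: January rolls back to December
# of the previous year, every other month just subtracts 1.
year, month = df["Month"] // 100, df["Month"] % 100
df["Month_LM"] = (year * 100 + month - 1).where(month > 1, (year - 1) * 100 + 12)

# Right-hand side of the merge: the same rows, with Month relabelled as the
# key to match against and Revenue relabelled as the value to bring over.
lookup = df[["Group", "Store", "Month", "Revenue"]].rename(
    columns={"Month": "Month_LM", "Revenue": "Revenue_LM"}
)

# A left merge keeps every original row; months with no match get NaN.
# validate="m:1" raises if a (Group, Store, Month) key is duplicated.
out = df.merge(lookup, on=["Group", "Store", "Month_LM"], how="left", validate="m:1")
print(out)
```

No dummy rows are created: a missing previous month simply finds no match and leaves Revenue_LM empty.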
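Answer 0 (score: 0)
I converted Month to datetime format; if you want to change it back to an integer, use df.Month.dt.year*100+df.Month.dt.month. My solution does not use the Month_LM column. Instead, it keeps the shifted revenue only where the month count (year*12+month) is exactly 1 higher than in the previous row.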
import numpy as np
import pandas as pd
df.Month = pd.to_datetime(df.Month, format='%Y%m')
df['Rev'] = df.groupby('Group').apply(lambda x: x.Revenue.shift() * (x.Month.dt.year*12 + x.Month.dt.month).diff().eq(1)).replace(0, np.nan).values
df
Out[1080]:
Group Store Month Revenue Month_LM Rev
0 Group1 A 2016-11-01 10 201610 NaN
1 Group1 A 2016-12-01 20 201611 10.0
2 Group1 A 2017-01-01 30 201612 20.0
3 Group1 B 2016-11-01 50 201610 NaN
4 Group1 B 2017-01-01 70 201612 NaN
5 Group1 B 2017-02-01 80 201701 70.0
6 Group2 C 2016-11-01 90 201610 NaN
7 Group2 C 2016-12-01 100 201611 90.0
8 Group2 C 2017-02-01 120 201701 NaN
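The same gap check can also be sketched without `apply`. This is a variant, not the code from the answer: it groups by both Group and Store so that a shift can never cross from one store into the next, it masks with `where` instead of multiplying and replacing 0, so a real revenue of 0 would survive, and it converts Month back to an integer at the end. The helper column `month_idx` is made up for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "Group": ["Group1"] * 6 + ["Group2"] * 3,
    "Store": ["A", "A", "A", "B", "B", "B", "C", "C", "C"],
    "Month": [201611, 201612, 201701, 201611, 201701, 201702,
              201611, 201612, 201702],
    "Revenue": [10, 20, 30, 50, 70, 80, 90, 100, 120],
})
df["Month"] = pd.to_datetime(df["Month"].astype(str), format="%Y%m")

# Running month count: consecutive calendar months differ by exactly 1.
df["month_idx"] = df["Month"].dt.year * 12 + df["Month"].dt.month

g = df.groupby(["Group", "Store"])
prev_rev = g["Revenue"].shift()   # revenue of the previous row in the store
gap = g["month_idx"].diff()       # months elapsed since the previous row

# Keep the previous row's revenue only if it is exactly one month back.
df["Revenue_LM"] = prev_rev.where(gap.eq(1))

# Back to the integer YYYYMM representation, as suggested in the answer.
df["Month"] = df["Month"].dt.year * 100 + df["Month"].dt.month
df = df.drop(columns="month_idx")
print(df)
```

This assumes the rows are already sorted by Month within each store, as they are in the question.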