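I have a data frame, shown below. I am trying to create lag-based columns for var1, var2, var3 by calculating (var_n / lag2(var_n)) - 2 (where n is 1, 2, 3).
The following code works fine for lag 2, but I need the calculation to be done per group of "grp".
Code: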
lag = [2]
df = pd.concat([df] + [df.groupby('grp')[['var1', 'var2', 'var3']].shift(x).add_prefix('lag' + str(x)) for x in lag], axis=1)
I also tried another approach, shown below, but could not work out how to apply the grouping to it:
yoy = [12]
columns_y = df.loc[:, 'var1':'var3']
for col in columns_y.columns:
    for x in yoy:
        columns_y.loc[:, col + "_yoy"] = (columns_y[col] / columns_y[col].shift(x)) - 1
Answer 0 (score: 0)
Try this:
import pandas as pd

df = pd.DataFrame({
    'grp': ['a', 'a', 'a', 'b', 'b', 'b'],
    'abc2': ['l', 'm', 'n', 'p', 'q', 'r'],
    'abc3': ['x', 'y', 'z', 'a', 'b', 'c'],
    'var1': [20, 30, 20, 40, 50, 90],
    'var2': [50, 80, 70, 20, 30, 40],
    'var3': [50, 80, 70, 20, 30, 40]})
lag = [2]
# Shift var1..var3 within each 'grp' group, one block of columns per lag
lags_df = pd.concat([
    df.groupby('grp')[[f'var{i+1}' for i in range(3)]]
      .shift(x)
      .add_prefix(f'lag{x}_')
    for x in lag
], axis=1)
print(pd.concat([df, lags_df], axis=1))
Output:
  grp abc2 abc3  var1  var2  var3  lag2_var1  lag2_var2  lag2_var3
0   a    l    x    20    50    50        NaN        NaN        NaN
1   a    m    y    30    80    80        NaN        NaN        NaN
2   a    n    z    20    70    70       20.0       50.0       50.0
3   b    p    a    40    20    20        NaN        NaN        NaN
4   b    q    b    50    30    30        NaN        NaN        NaN
5   b    r    c    90    40    40       40.0       20.0       20.0
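The answer above stops at the lagged columns, while the question asks for the ratio of each variable to its grouped lag. A minimal sketch of that last step follows, assuming the subtraction of 1 used in the question's second snippet (the question's text writes 2 instead, so adjust the constant if that is what is intended). The `_chg{x}` column suffix and the names `value_cols` and `ratio_frames` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'grp': ['a', 'a', 'a', 'b', 'b', 'b'],
    'var1': [20, 30, 20, 40, 50, 90],
    'var2': [50, 80, 70, 20, 30, 40],
    'var3': [50, 80, 70, 20, 30, 40],
})

value_cols = ['var1', 'var2', 'var3']  # columns to transform
lags = [2]                             # lag lengths, as in the question

ratio_frames = []
for x in lags:
    # Lag each variable within its own 'grp' group; the original index is kept
    lagged = df.groupby('grp')[value_cols].shift(x)
    # Same column names and index on both sides, so the division aligns row by row
    ratio = (df[value_cols] / lagged - 1).add_suffix(f'_chg{x}')
    ratio_frames.append(ratio)

result = pd.concat([df] + ratio_frames, axis=1)
print(result)
```

The first two rows of every group come out as NaN, since no value two rows back exists inside the group. Because the lag is taken by row position within each group, the rows are assumed to already be in the intended order.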