I have a DataFrame df:
ID DATE ACTUALVALUE B60 C60
1 10-05-2018 5 2 4
1 10-07-2018 8 3 2
1 10-14-2018 3 6 7
1 10-19-2018 6 5 3
1 10-22-2018 4 1 1
1 10-29-2018 5 5 5
2 10-06-2018 3 2 4
2 10-08-2018 8 3 5
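I want to create a new column, PreviousDate, that holds the previous DATE within each ID. I also want to compute moving averages of B60 and C60 within each ID, using a window of 5 rows. So df should look like: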
ID DATE ACTUALVALUE B60 C60 PREVIOUSDATE MA B60 MA C60
1 10-05-2018 5 2 4 NA NA NA
1 10-07-2018 8 3 2 10-05-2018 NA NA
1 10-14-2018 3 6 7 10-07-2018 NA NA
1 10-19-2018 6 5 3 10-14-2018 NA NA
1 10-22-2018 4 1 1 10-19-2018 3.4 3.4
1 10-29-2018 5 5 5 10-22-2018 4 3.6
2 10-06-2018 3 2 4 NA NA NA
2 10-08-2018 8 3 5 10-06-2018 NA NA
I tried:
df["PreviousDate"] = df.groupby("ID").DATE.shift(1)
Allid = df["ID"].unique()
for i in Allid:
    df['MA B60'] = df['B60'].rolling(window=5, center=False).mean()
    df['MA C60'] = df['C60'].rolling(window=5, center=False).mean()
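However, the rolling average carries over into the next ID, so only the first ID gets the four leading NA values. After that, the rolling average keeps being computed regardless of whether the ID changes.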
If I try:
for i in Allid:
    if i == df["ID"]:
        df['MA B60'] = df['B60'].rolling(window=5, center=False).mean()
        df['MA C60'] = df['C60'].rolling(window=5, center=False).mean()
But I get the following message:
"ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."
I also tried:
df['MA B60'] = df.groupby("ID").B60.rolling(window=5, center=False).mean()
But the result is:
"TypeError: incompatible index of inserted column with frame index"