I have the following sales data in sales_training.csv:
time_period sales
1 127
2 253
3 123
4 253
5 157
6 105
7 244
8 157
9 130
10 221
11 132
12 265
I want to add a third column containing the moving average. My code:
import pandas as pd

df = pd.read_csv("./Sales_training.csv", index_col="time_period")
periods = df.index.tolist()
period = int(input("Enter a period for the moving average :"))
sum1 = 0
for i in periods:
    if i < period:
        df['forecast'][i] = i
    else:
        for j in range(period):
            sum1 += df['sales'][i-j]
        df['forecast'][i] = sum1/period
        sum1 = 0
print(df)
df.to_csv("./forecast_mannual.csv")
This raises KeyError: 'forecast' at the line df['forecast'][i] = i. What is the problem?
Answer 0 (score: 1)
A simple solution: create the column before the loop, e.g. df['forecast'] = df['sales']:
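import pandas as pd

df = pd.read_csv("./Sales_training.csv", index_col="time_period")
periods = df.index.tolist()
period = int(input("Enter a period for the moving average :"))
sum1 = 0
# add one line: create the column before the loop
# (cast to float so that fractional averages can be stored in it)
df['forecast'] = df['sales'].astype(float)
for i in periods:
    if i < period:
        # not enough earlier rows for a full window
        df.loc[i, 'forecast'] = i
    else:
        # sum the current row and the (period - 1) rows before it
        for j in range(period):
            sum1 += df.loc[i - j, 'sales']
        # .loc writes into df itself instead of relying on chained assignment
        df.loc[i, 'forecast'] = sum1 / period
        sum1 = 0
print(df)
df.to_csv("./forecast_mannual.csv")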
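Answer 1 (score: 1)
Your code raises the KeyError because of the way it references the values of the 'forecast' column. When the code runs for the first time, the 'forecast' column has not been created yet, so the attempt to reference df['forecast'][i] fails with a KeyError.
Here, the task is to update values in a new, dynamically created column called 'forecast'. So you can write df.at[i, 'forecast'] instead of df['forecast'][i].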
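To make the difference concrete, here is a minimal sketch using a small stand-in frame built from the first three rows of the question's data (the frame itself and the `raised` flag are only for illustration). It shows that looking up the missing column fails, while writing through .at creates the column on first use:

```python
import pandas as pd

# Toy frame standing in for the first rows of the sales data.
df = pd.DataFrame(
    {"sales": [127, 253, 123]},
    index=pd.Index([1, 2, 3], name="time_period"),
)

# df["forecast"] is evaluated first, and the column does not exist yet,
# so this raises KeyError before any assignment happens.
try:
    df["forecast"][1] = 0.0
    raised = False
except KeyError:
    raised = True
print("KeyError raised:", raised)

# Writing through .at creates the column on first use;
# rows that were never written hold NaN.
df.at[2, "forecast"] = (df.at[1, "sales"] + df.at[2, "sales"]) / 2
print(df)
```

Rows that are never assigned simply stay empty (NaN), which also suits the case where there are not yet enough rows for a full window.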
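There is another problem in the code. When i is less than period, you assign i to the forecast, which, as far as I understand, is not correct. In that case it should not show anything.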
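As a side note, the same "empty until a full window is available" behaviour can be had without an explicit loop. This is a sketch of an alternative, not the approach used in the question: it relies on pandas' built-in Series.rolling, rebuilds the question's data inline instead of reading the CSV, and hard-codes period = 2 as an assumption in place of input().

```python
import pandas as pd

# The question's data, built inline so the sketch runs without the CSV file.
df = pd.DataFrame(
    {"sales": [127, 253, 123, 253, 157, 105, 244, 157, 130, 221, 132, 265]},
    index=pd.Index(range(1, 13), name="time_period"),
)
period = 2  # assumed window length; the original code reads it with input()

# rolling(window=period).mean() averages each row with the (period - 1)
# rows before it; rows without a full window are left as NaN.
df["forecast"] = df["sales"].rolling(window=period).mean()
print(df)
```

With period = 2 the first row has no forecast, and the second row is the mean of 127 and 253.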
Here is my corrected version of the code:
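import pandas as pd

df = pd.read_csv("./sales.csv", index_col="time_period")
periods = df.index.tolist()
period = int(input("Enter a period for the moving average :"))
sum1 = 0
for i in periods:
    print(i)
    if i < period:
        # not enough earlier rows for a full window: leave the cell blank
        df.at[i, 'forecast'] = ''
    else:
        # sum the current row and the (period - 1) rows before it
        for j in range(period):
            sum1 += df.at[i - j, 'sales']
        # write with .at here as well, instead of df['forecast'][i]
        df.at[i, 'forecast'] = sum1 / period
        sum1 = 0
print(df)
df.to_csv("./forecast_mannual.csv")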
Output when I enter period = 2 to compute the moving average:
Hope this helps.