Python / Pandas - How to fix ARIMA code to produce actual forecasts once training/testing is done

Time: 2018-11-29 11:07:27

Tags: python pandas forecasting arima

My dataset looks like this:

date           bookings
2017-01-01     438
2017-01-02     167
...
2017-12-31     45
2018-01-01     748
...
2018-11-29     223

I need something like this (i.e. forecast values extending beyond the end of the dataset):

date           bookings
2017-01-01     438
2017-01-02     167
...
2017-12-31     45
2018-01-01     748
...
2018-11-29     223
2018-11-30     98
...
2018-12-30     73
2018-12-31     100

The code I have used so far (training/testing phase):

import pandas as pd
import statsmodels.api as sm
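# Legacy ARIMA class (statsmodels < 0.13); fit(disp=0) and the tuple returned by forecast() below rely on it
from statsmodels.tsa.arima_model import ARIMA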
# from sklearn.metrics import mean_squared_error

import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
import matplotlib
matplotlib.rcParams['axes.labelsize'] = 14
matplotlib.rcParams['xtick.labelsize'] = 12
matplotlib.rcParams['ytick.labelsize'] = 12
matplotlib.rcParams['text.color'] = 'k'

df = pd.read_csv('data.csv',names = ["date","bookings"],index_col=0)
df.index = pd.to_datetime(df.index)

X = df.values
size = int(len(X) * 0.66)
train, test = X[0:size], X[size:len(X)]
history = [x for x in train]
predictions = list()
for t in range(len(test)): 
    model = ARIMA(history, order=(1,1,0))
    model_fit = model.fit(disp=0)
    output = model_fit.forecast()

    yhat = output[0]
    predictions.append(yhat) 

    obs = test[t]
    history.append(obs)

    #   print('predicted=%f, expected=%f' % (yhat, obs))
#error = mean_squared_error(test, predictions)
#print(error)
#print('Test MSE: %.3f' % error)
# plot
plt.figure(num=None, figsize=(15, 8))
plt.plot(test)
plt.plot(predictions, color='red')
plt.show()

Exporting the results to CSV:

df_forecast = pd.DataFrame(predictions)
df_test = pd.DataFrame(test)
result = pd.merge(df_test, df_forecast, left_index=True, right_index=True)
result.rename(columns = {'0_x': 'Test', '0_y': 'Forecast'}, inplace=True)
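
As an aside, the merge above relies on the automatic `_x`/`_y` suffixes and never actually writes a file. A minimal sketch of a more direct route is to build the table with named columns and a date index, then call `to_csv`. The dates and numbers below are hypothetical stand-ins for `test` and `predictions`, and the sketch assumes each prediction has already been reduced to a plain float.

```python
import pandas as pd

# Hypothetical stand-ins for `test` and `predictions` from the loop above.
test_dates = pd.date_range("2018-11-27", periods=3, freq="D")
test_values = [210.0, 198.0, 223.0]
predicted_values = [205.5, 201.2, 219.8]

# Name the columns up front instead of renaming merge suffixes afterwards.
result = pd.DataFrame(
    {"Test": test_values, "Forecast": predicted_values},
    index=pd.Index(test_dates, name="date"),
)

# With no path, to_csv returns the CSV text; pass a file name to write to disk.
csv_text = result.to_csv()
print(csv_text)
```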

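Where I am stuck is using the model to generate forecasts beyond the end of my dataset. How should I modify the code so that, once trained, it produces those forecasts? I roughly understand that I need to extend the iteration up to some end date, but when I do that the results are terrible...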

How do I get from the forecasts I need to create (rather than the test predictions) to something I can export to CSV?

What I tried (it failed):

# APPLICATION PHASE ATTEMPT
fc_size =  len(pd.date_range(start='29/11/2018', end='31/12/2018'))

for i in range(fc_size):
    model = ARIMA(history, order=(1,1,0))
    model_fit = model.fit(disp=0)
    output = model_fit.forecast()
    yhat = output[0]
    predictions.append(yhat)
    obs = predictions[t]
    history.append(obs)
#   print('predicted=%f, expected=%f' % (yhat, obs))
# error = mean_squared_error(test, predictions)
# print('Test MSE: %.3f' % error)

# plot
plt.figure(num=None, figsize=(15, 8))
plt.plot(test)
plt.plot(predictions, color='red')
plt.show()

Any help would be greatly appreciated.

0 answers:

No answers