Need help improving a Python cash flow forecast built with a SARIMA model

Time: 2020-11-03 07:18:06

Tags: python-3.x machine-learning time-series forecasting arima

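I am building a weekly cash flow forecast in Python with a SARIMAX model, but I am not satisfied with the results. I use auto_arima to find the best order and seasonal order for the SARIMA model. I have more than 5 years of historical data, which should be enough to build a good model. My data looks like the attached plot: [image: Historical Data]. Decomposing it with freq = 7 gives the following: [image: statsmodels decomposition]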

Best model: SARIMAX(1,0,1)(2,1,0)[52]
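For reference, this notation can be expanded with the backshift operator $B$ (where $B^k y_t = y_{t-k}$). The form below is the standard textbook expansion, ignoring any constant or trend term; the coefficient symbols $\phi_1$, $\theta_1$, $\Phi_1$, $\Phi_2$ are chosen here for illustration and do not come from the post.

```latex
(1 - \phi_1 B)\,\bigl(1 - \Phi_1 B^{52} - \Phi_2 B^{104}\bigr)\,(1 - B^{52})\, y_t
  = (1 + \theta_1 B)\, \varepsilon_t
```

Multiplying out the left-hand side, the highest lag is $1 + 104 + 52 = 157$, so each fitted value depends on observations up to 157 weeks back. That is close to half of the 335 weekly dates spanned by the index defined in the code.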

[image: Forecast Result]

The Mean Squared Error of our forecasts is 4625364095.19
The Root Mean Squared Error of our forecasts is 68010.03

Prediction Steps=25

The RMSE is too high, so I am looking for help improving the model's performance. A quick response would be appreciated.
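To put "too high" on a scale, the reported error can be compared with the typical weekly amount. The sketch below only reuses numbers from the post: the printed MSE and the 13 values in the `actual` list. It assumes, without confirmation from the post, that `actual` is representative of the weeks being evaluated.

```python
import math

# Values copied from the post.
mse = 4625364095.19
actual = [35592.63, 111814.61, 164527.43, 136719.53, 130048.37, 66672.31,
          151650.05, 98633.68, 218984.49, 32640.38, 119842.40, 114052.16,
          78411.80]

# RMSE is the square root of the MSE, so it is in the same units as the data.
rmse = math.sqrt(mse)

# Assumption: `actual` reflects the scale of the evaluated weeks.
mean_actual = sum(actual) / len(actual)
relative_rmse = rmse / mean_actual

print(f"RMSE:           {rmse:.2f}")
print(f"Mean of actual: {mean_actual:.2f}")
print(f"RMSE / mean:    {relative_rmse:.1%}")
```

Under that assumption the RMSE is roughly 60% of an average weekly amount, which gives a scale-free way to compare candidate models.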

My code is as follows:

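# Imports are not shown in the original post; these are the assumed ones.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.api as sm
from pmdarima import auto_arima

actual = [35592.63, 111814.61, 164527.43, 136719.53, 130048.37, 66672.31, 151650.05, 98633.68, 218984.49, 32640.38, 119842.40, 114052.16, 78411.80]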
dt = pd.date_range("20140113","20200608", freq='W-MON')
df2 = pd.read_csv('mse_ar_data.csv')
df2.index=dt
df2 = np.ceil(df2)
df2
stepwise_fit = auto_arima(df2, start_p = 1, start_q = 1, 
                          max_p = 5, max_q = 5, m = 52, 
                          start_P = 0, seasonal = True, 
                          d = None, D = 1, trace = True, 
                          error_action ='ignore',   # we don't want to know if an order does not work 
                          suppress_warnings = True,  # we don't want convergence warnings 
                          stepwise = True)

stepwise_fit.summary()
model = sm.tsa.statespace.SARIMAX(df2,
                                order=(1, 0, 1),
                                seasonal_order=(2, 1, 0, 52),
                                enforce_stationarity=False,
                                enforce_invertibility=False)
results_ar = model.fit()
print(results_ar.summary().tables[1])

#Diagnostic Plot
results_ar.plot_diagnostics(figsize=(16, 8))
plt.show()

#Prediction
pred_ar = results_ar.get_prediction(start=pd.to_datetime('2020-03-02'), dynamic=False)
pred_ar_ci = pred_ar.conf_int()
ax = df2['2016-01':].plot(label='observed')
pred_ar.predicted_mean.plot(ax=ax, label='One-step ahead Forecast', alpha=.7, figsize=(14, 7))
ax.fill_between(pred_ar_ci.index,
                pred_ar_ci.iloc[:, 0],
                pred_ar_ci.iloc[:, 1], color='k', alpha=.2)
ax.set_xlabel('Date')
ax.set_ylabel('Cash Inflow AR')
plt.legend()
plt.show()

y_forecasted =  pred_ar.predicted_mean
y_truth = df2['2020-03-02':]['ar_amount']
mse = ((y_forecasted - y_truth) ** 2).mean()
print('\nThe Mean Squared Error of our forecasts is {}'.format(round(mse, 2)))
print('The Root Mean Squared Error of our forecasts is {}'.format(round(np.sqrt(mse), 2)))
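# NOTE: pred_ar_uc is not defined in the code shown here; it is presumably a
# separate forecast object whose length matches the 13 values in `actual`.
forcast_ar = pd.DataFrame({'Actual':actual, 'Forecasted':pred_ar_uc.predicted_mean})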
forcast_ar = forcast_ar.round(2)
forcast_ar['Delta'] = forcast_ar['Forecasted']-forcast_ar['Actual']
print(forcast_ar)

total_delta = round(np.abs(forcast_ar.Delta).sum(),2)
avg_delta = round(np.abs(forcast_ar.Delta).mean(),2)
print('\nTotal Delta:',total_delta)
print('Average Delta:',avg_delta)
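Note that the code above fits the model on all of `df2` and then scores one-step-ahead predictions for weeks that were already in the fitting data, so the printed RMSE is an in-sample figure. A minimal sketch of a holdout comparison against a seasonal-naive baseline follows. The series is synthetic (it is not the data from the post), and `HORIZON`, `SEASON`, and the helper `rmse` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the weekly series; NOT the data from the post.
idx = pd.date_range("2014-01-13", "2020-06-08", freq="W-MON")
rng = np.random.default_rng(0)
weeks = np.arange(len(idx))
values = (100_000
          + 30_000 * np.sin(2 * np.pi * weeks / 52)
          + rng.normal(0, 20_000, len(idx)))
series = pd.Series(values, index=idx, name="ar_amount")

HORIZON = 13  # hypothetical number of held-out weeks
SEASON = 52   # weeks per seasonal cycle, as in the fitted model

# Hold out the last HORIZON weeks; a model should only ever see `train`.
train, test = series.iloc[:-HORIZON], series.iloc[-HORIZON:]

# Seasonal-naive forecast: each held-out week repeats the value from
# SEASON weeks earlier (all of which fall inside `train`).
naive = series.shift(SEASON).iloc[-HORIZON:]


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.sqrt(np.mean(diff ** 2)))


baseline_rmse = rmse(test, naive)
print(f"observations: {len(series)}, train: {len(train)}, test: {len(test)}")
print(f"seasonal-naive RMSE on holdout: {baseline_rmse:.2f}")
```

With the real data, the SARIMAX model could be fitted on `train` only and evaluated with `get_forecast(steps=HORIZON)` on the same held-out weeks; a model that does not beat the seasonal-naive RMSE is probably not capturing more than the yearly pattern.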

0 answers:

No answers