How to forecast time series data in Python using ML

Time: 2017-05-09 03:06:07

Tags: python python-2.7 python-3.x

I am building a time series forecasting model.

My dataset looks like this:

job,date,maxsal,minsal
Engineer,2001-01,1137,578
Engineer,2001-02,1187,519
Engineer,2001-03,1131,546
Engineer,2001-04,1049,604
Engineer,2001-05,1129,579
Engineer,2001-06,1133,563

The code is:

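from pandas import DataFrame
from matplotlib import pyplot
from statsmodels.tsa.arima_model import ARIMA

# series holds the dataset shown above (loading code not shown)
model = ARIMA(series, order=(1,1,0))
model_fit = model.fit(disp=0)
print(model_fit.summary())
# plot residual errors
residuals = DataFrame(model_fit.resid)
residuals.plot()
pyplot.show()
# plot the density of the residuals
residuals.plot(kind='kde')
pyplot.show()
print(residuals.describe())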

This raises the following error:

model_fit = model.fit(disp=0)
  File "/usr/lib/python2.7/dist-packages/statsmodels/tsa/arima_model.py", line 1104, in fit
    callback, **kwargs)
  File "/usr/lib/python2.7/dist-packages/statsmodels/tsa/arima_model.py", line 919, in fit
    start_params = self._fit_start_params((k_ar, k_ma, k), method)
  File "/usr/lib/python2.7/dist-packages/statsmodels/tsa/arima_model.py", line 556, in _fit_start_params
    start_params = self._fit_start_params_hr(order)
  File "/usr/lib/python2.7/dist-packages/statsmodels/tsa/arima_model.py", line 493, in _fit_start_params_hr
    endog -= np.dot(exog, ols_params).squeeze()
TypeError: Cannot cast ufunc subtract output from dtype('float64') to dtype('int64') with casting rule 'same_kind'

Given a year as input, I want to predict what minsal and maxsal will be. I also need to plot the result on a chart.

I have uploaded the file here.

Can anyone help me?

1 answer:

Answer 0: (score: 1)

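You need to convert the minsal and maxsal columns to float64. The traceback shows statsmodels subtracting float values from your data in place (endog -= ...), and that fails when the data is stored as int64.

You can do this simply with numpy:
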
import numpy as np

series['maxsal']= series['maxsal'].astype(np.float64)
series['minsal']= series['minsal'].astype(np.float64)

Add these two lines before the call to ARIMA:

model = ARIMA(series, order=(1,1,0))
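
To see why the cast matters, here is a minimal sketch that reproduces the same kind of failure with numpy alone, without statsmodels. The array values and the helper name subtract_in_place are my own, made up for illustration; only the in-place subtraction mirrors the line in the traceback.

```python
import numpy as np

# Hypothetical stand-in for the integer salary column handed to ARIMA.
endog = np.array([1137, 1187, 1131, 1049, 1129, 1133], dtype=np.int64)
# Hypothetical float values standing in for np.dot(exog, ols_params).
fitted = np.full(endog.shape, 0.5)


def subtract_in_place(values, delta):
    """Try `values -= delta` on a copy; return (ok, result or error message)."""
    work = values.copy()
    try:
        work -= delta  # same in-place operation as in the traceback
    except TypeError as exc:
        return False, str(exc)
    return True, work


# int64 data: the float result cannot be written back into an int64 array.
int_ok, int_result = subtract_in_place(endog, fitted)
# float64 data: the same operation goes through.
float_ok, float_result = subtract_in_place(endog.astype(np.float64), fitted)

print(int_ok, int_result)
print(float_ok, float_result)
```

The first call fails with a casting error like the one in the question, the second succeeds, which is why converting the columns up front helps.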

I ran into this problem too, although in my case there was only one column to convert. I hope it works for you as well.

If every column in series is numeric, you can also convert the whole thing in one step:

series = series.astype(np.float64)
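
As a rough sketch of the conversion on data shaped like the sample in the question: I am assuming here that the file is read with pandas and that job and date are text columns, in which case converting the whole frame at once would fail and only the salary columns should be cast. The names csv_text, df and maxsal_series are just for illustration.

```python
import io

import numpy as np
import pandas as pd

# Sample rows copied from the question, inlined so the sketch is self-contained.
csv_text = """job,date,maxsal,minsal
Engineer,2001-01,1137,578
Engineer,2001-02,1187,519
Engineer,2001-03,1131,546
Engineer,2001-04,1049,604
Engineer,2001-05,1129,579
Engineer,2001-06,1133,563
"""

df = pd.read_csv(io.StringIO(csv_text))

# Use the month as the index so each column becomes a time series.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m")
df = df.set_index("date")

# Cast only the salary columns; the text column "job" is left untouched.
df = df.astype({"maxsal": np.float64, "minsal": np.float64})

# A single float64 column that could then be passed to the model.
maxsal_series = df["maxsal"]

print(df.dtypes)
print(maxsal_series.head())
```

Selecting one salary column at a time also keeps the input to a single series per model.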