Question

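I am working with some time series, so I need to compare different methods using Python. Specifically, I need to use triple exponential smoothing to generate some forecasts, and I am using this library and a similar function, this. My time series has the following format, as a pd.Series object: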

    Date Close
2016-04-11 01:17:04    -10.523793
2016-04-11 07:25:13     -5.352295
2016-04-11 22:40:11     92.556003
2016-04-13 05:06:31     -1.769866
2016-04-13 05:17:50     -2.330789
2016-04-14 08:43:09     17.636638
2016-04-17 21:15:12     -0.454655
2016-04-19 06:10:04     -0.026375
2016-04-19 06:10:04     -0.175647
...

I wrote the following lines in Python:

from statsmodels.tsa.holtwinters import ExponentialSmoothing
from matplotlib import pyplot as plt
import numpy as np 
import pandas as pd
train_size = int(len(myTimeSeries) * 0.66)
train, test = myTimeSeries[1:train_size], myTimeSeries[train_size:]

model = ExponentialSmoothing(train)
model_fit = model.fit()
dict=model.params
params=np.array(list(dict.items()))
dates=test.index.astype(str)
pred = model.predict(params,start=dates[2], end=dates[-1])
plt.plot(train, label='Train')
plt.plot(test, label='Test')
plt.plot(pred, label='Holt-Winters')
plt.legend(loc='best')

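I ran into a problem with the function model.predict, so I added the parameter values it requires, taking them from the model after calling fit. I am not sure I did this correctly, but I could not find much about it. In addition, I have a problem setting the start date (and perhaps the end date). It returns the following: KeyError: 'The `start` argument could not be matched to a location related to the index of the data.' As I found here, I also moved the start date of the prediction to the third value, i.e. index [2] of the test dataset. If I set it to [0], [1], and so on, I get the same message.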

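As you can see, myTimeSeries has no fixed frequency; the values are recorded at irregular times. I found different tutorials, such as this, this other one, or this about theory, but their conditions are different: I have no prior information (trend, seasonality, etc.) about my dataset. I did not find any assumption that this violates; please warn me if I am wrong. I also consulted a theory guide, found here. In addition, this post covers a similar problem, but not exactly the same one.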

Answer 1

I think maybe you just want

pred = model_fit.forecast(len(test))

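You cannot use dates to specify the forecast period here, because your date index has no frequency associated with it. The best you can do is specify the number of forecasts you want.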

Forecasting with triple exponential smoothing using ExponentialSmoothing from statsmodels in Python
