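I am following this great article to learn sales data analysis and forecasting.
I am at section 4.3.4.1, "Long-term forecasting with Vanilla LSTM configuration". (Please use the link above and scroll to 4.3.4.1. Sorry, I tried to create an anchor link to that exact section but failed.) The forecast there is only one step ahead, whereas I am trying to forecast 4 steps ahead, that is, to predict sales for the next 4 weeks. I edited the original code and added n_steps_in and n_steps_out to the split_sequence function: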
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from sklearn.preprocessing import MinMaxScaler

def split_sequence(sequence, n_steps_in, n_steps_out):
    # build (input window, output window) pairs from one series
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the sequence
        if out_end_ix > len(sequence):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

# keep the last 50 weeks for testing (df is loaded in the code below)
size = int(len(df) - 50)
n_steps_in = 5   # weeks of history fed to the model
n_steps_out = 4  # weeks to forecast
n_features = 1
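To check what the modified split_sequence returns, here is a small shape test. It is only a sketch: the toy series of 20 values stands in for one scaled sales column, and it assumes the data has already been turned into a column with reshape(-1, 1) before splitting, as my loop does.

```python
import numpy as np

def split_sequence(sequence, n_steps_in, n_steps_out):
    # Same logic as above, condensed for the shape check
    X, y = [], []
    for i in range(len(sequence)):
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        if out_end_ix > len(sequence):
            break
        X.append(sequence[i:end_ix])
        y.append(sequence[end_ix:out_end_ix])
    return np.array(X), np.array(y)

# Toy stand-in for one scaled column: 20 values shaped (20, 1)
series = np.arange(20, dtype=float).reshape(-1, 1)
X, y = split_sequence(series, n_steps_in=5, n_steps_out=4)

print(X.shape)  # (12, 5, 1)
print(y.shape)  # (12, 4, 1)
```

Note that y comes out three-dimensional, because each element of the column-shaped input is itself an array of length 1.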
and modified the original code as follows:
df=pd.read_csv('data/salesweekly.csv')
subplotindex=0
numrows=4
numcols=2
fig, ax = plt.subplots(numrows, numcols, figsize=(18,15))
plt.subplots_adjust(wspace=0.1, hspace=0.3)
warnings.filterwarnings("ignore")
r=['M01AB','M01AE','N02BA','N02BE','N05B','N05C','R03','R06']
for x in r:
    rowindex=math.floor(subplotindex/numcols)
    colindex=subplotindex-(rowindex*numcols)
    X=df[x].values
    scaler = MinMaxScaler(feature_range = (0, 1))
    X=scaler.fit_transform(X.reshape(-1, 1))
    # split into samples
    X_train,y_train=split_sequence(X[0:size], n_steps_in, n_steps_out)
    X_test,y_test=split_sequence(X[size:len(df)], n_steps_in, n_steps_out)
    # reshape from [samples, timesteps] into [samples, timesteps, features]
    X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], n_features))
    # define model
    model = Sequential()
    model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps_in, n_features)))
    model.add(LSTM(100, activation='relu'))
    model.add(Dense(n_steps_out))
    model.compile(optimizer='adam', loss='mse')
    # fit model
    model.fit(X_train, y_train, epochs=400, verbose=0)
    X_test = X_test.reshape((len(X_test), n_steps_in, n_features))
    predictions = model.predict(X_test, verbose=0)
    y_test=scaler.inverse_transform(y_test)
    predictions = scaler.inverse_transform(predictions)
    error = mean_squared_error(y_test, predictions)
    perror = mean_absolute_percentage_error(y_test, predictions)
    resultsLongtermdf.loc['Vanilla LSTM MSE',x]=error
    resultsLongtermdf.loc['Vanilla LSTM MAPE',x]=perror
    ax[rowindex,colindex].set_title(x+' (MSE=' + str(round(error,2))+', MAPE='+ str(round(perror,2)) +'%)')
    ax[rowindex,colindex].legend(['Real', 'Predicted'], loc='upper left')
    ax[rowindex,colindex].plot(y_test)
    ax[rowindex,colindex].plot(predictions, color='red')
    subplotindex=subplotindex+1
plt.show()
I added n_steps_in and n_steps_out in the respective places and fitted the model with them.
But I get this error:
ValueError: Found array with dim 3. Estimator expected <= 2.
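One plausible source of this error, assuming it is raised by scaler.inverse_transform(y_test): scikit-learn expects arrays with at most two dimensions, while y_test here would have shape (samples, n_steps_out, 1). The sketch below uses made-up toy data, not the sales file, and shows one possible workaround: drop the trailing axis, then undo the scaling through a single column and restore the 2-D shape.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

n_steps_out = 4

# Toy data standing in for one sales column (values are made up)
raw = np.linspace(10.0, 50.0, 30).reshape(-1, 1)
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(raw)

# Mimic the target array from split_sequence: (samples, n_steps_out, 1)
y_test = np.stack([scaled[i:i + n_steps_out] for i in range(5)])
print(y_test.shape)  # (5, 4, 1)

# Passing the 3-D array straight to the scaler raises ValueError
try:
    scaler.inverse_transform(y_test)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# Drop the trailing feature axis so the targets are (samples, n_steps_out)
y_2d = y_test.reshape(y_test.shape[0], n_steps_out)
# The scaler was fitted on one column, so invert through a single column
y_inv = scaler.inverse_transform(y_2d.reshape(-1, 1)).reshape(y_2d.shape)
print(y_inv.shape)  # (5, 4)
```

The same reshape would presumably be needed for y_train before model.fit, so that the targets match the Dense(n_steps_out) output, but I have not verified that on the real data.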
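I am stuck here. The result I am looking for is similar to the one in the original article, except that the forecast should cover the next 4 weeks instead of just one week.
Can anyone help? Thank you very much.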
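To make the goal concrete: with a 4-week forecast, the actual and predicted values would both be two-dimensional, (samples, n_steps_out), so the error could be reported per forecast week as well as overall. A sketch with made-up numbers, using NumPy only rather than the article's metric functions:

```python
import numpy as np

# Made-up actuals and forecasts: 3 samples, 4 weeks ahead each
y_true = np.array([[10., 12., 11., 13.],
                   [12., 11., 13., 14.],
                   [11., 13., 14., 12.]])
y_pred = np.array([[11., 12., 10., 13.],
                   [12., 13., 13., 12.],
                   [10., 13., 15., 12.]])

squared_error = (y_true - y_pred) ** 2

# One MSE per forecast horizon (week 1 .. week 4)
mse_per_week = squared_error.mean(axis=0)
# Single MSE over all samples and horizons
overall_mse = squared_error.mean()

print(mse_per_week)
print(overall_mse)  # 1.0
```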