I am trying to build an LSTM model as an alternative model for time series forecasting. My pandas dataset dfTop50 has 25 columns: Product_Code, M0, M1, ..., M23. It has 50 rows.
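To make the layout concrete, here is a minimal sketch that builds a stand-in for dfTop50 and reshapes it to long format. The sales values are random and the product codes (`P000`, `P001`, ...) are made up for illustration; only the column names and the 50 x 25 shape come from the description above.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for dfTop50: random sales values, hypothetical product codes.
rng = np.random.default_rng(0)
month_cols = ['M%d' % i for i in range(24)]
dfTop50 = pd.DataFrame(rng.integers(0, 100, size=(50, 24)), columns=month_cols)
dfTop50.insert(0, 'Product_Code', ['P%03d' % i for i in range(50)])

# Wide to long: one row per (product, month) pair.
long_df = dfTop50.melt(id_vars='Product_Code', var_name='Month', value_name='Sales')
# Turn 'M7' into the integer 7.
long_df['Month'] = long_df['Month'].str.extract(r'(\d+)', expand=False).astype(int)

print(dfTop50.shape)  # (50, 25)
print(long_df.shape)  # (1200, 3)
```

Each of the 50 products contributes 24 monthly rows, which is where the 1200 rows come from.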
The code below raises an error:
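# Imports assumed for the snippets in this post (TensorFlow's bundled Keras; adjust for standalone Keras)
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.callbacks import EarlyStopping
# mean_absolute_percentage_error is used below but not shown in this post; it is assumed to be defined elsewhere

# Reshape from wide (one column per month) to long (one row per product and month)
melt = dfTop50.melt(id_vars='Product_Code', var_name='Month', value_name='Sales')
melt['Product_Code'] = melt['Product_Code'].astype(str)
melt['Month'] = melt['Month'].str.extract(r'(\d+)', expand=False).astype(int)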
melt5 = melt.copy()
# Lag features per product: sales shifted by 1..12 months, plus the change of each lag
for lag in range(1, 13):
    prefix = 'Last' if lag == 1 else 'Last-%d' % (lag - 1)
    melt5[prefix + '_Month_Sales'] = melt5.groupby(['Product_Code'])['Sales'].shift(lag)
    melt5[prefix + '_Month_Diffs'] = melt5.groupby(['Product_Code'])[prefix + '_Month_Sales'].diff()
melt5 = melt5.dropna()
melt5 = melt5.drop(['Product_Code'], axis=1)
for month in range(18, 19):
    # Train on all months before `month`, validate on `month` itself
    train = melt5[melt5['Month'] < month]
    val = melt5[melt5['Month'] == month]
    xtr, xts = train.drop(['Sales'], axis=1), val.drop(['Sales'], axis=1)
    ytr, yts = train['Sales'].values, val['Sales'].values
    X_train = np.asmatrix(xtr)
    Y_train = np.asmatrix(ytr)
    X_test = np.asmatrix(xts)
    Y_test = np.asmatrix(yts)
    # Intended to add a third axis for the LSTM layer
    X_train_lmse = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
    X_test_lmse = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)
    lstm_model = Sequential()
    lstm_model.add(LSTM(7, input_shape=(1, X_train_lmse.shape[1]), activation='relu', kernel_initializer='lecun_uniform', return_sequences=False))
    lstm_model.add(Dense(1))
    lstm_model.compile(loss='mean_absolute_error', optimizer='adam')
    early_stop = EarlyStopping(monitor='loss', patience=2, verbose=1)
    history_lstm_model = lstm_model.fit(X_train_lmse, Y_train, epochs=100, batch_size=1, verbose=1, shuffle=False, callbacks=[early_stop])
    # Predict with the LSTM model defined above (the original line called `model`, which is not defined in this snippet)
    nn_predictions_lstm = np.squeeze(lstm_model.predict(X_test_lmse))
    errorlstm = mean_absolute_percentage_error(Y_test, nn_predictions_lstm)
    print('Month %d - NN Error %.1f' % (month, errorlstm))
I get the error message: "expected lstm_input to have 3 dimensions, but got array with shape (250, 25)".
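For context on this message: Keras recurrent layers such as LSTM take input shaped (samples, timesteps, features), while `np.matrix` objects are always two-dimensional, so a reshape applied to the result of `np.asmatrix` cannot produce the third axis. The sketch below assumes a zero-filled stand-in array with the 250 x 25 shape from the error and shows two plain-ndarray layouts that would have three dimensions. It only illustrates the shapes and is not a verified fix for the model itself.

```python
import numpy as np

# Stand-in for the training features: 250 rows x 25 columns, as in the error message.
features = np.zeros((250, 25))

# np.matrix is always 2-D, so the requested third axis cannot survive.
as_matrix = np.asmatrix(features)
try:
    attempted = as_matrix.reshape(250, 25, 1)
    print('matrix after reshape:', attempted.shape)
except ValueError as exc:
    print('matrix reshape failed:', exc)

# A plain ndarray can take either 3-D layout.
as_array = np.asarray(features)
# One timestep per sample, 25 features per timestep.
one_step = as_array.reshape(as_array.shape[0], 1, as_array.shape[1])
# 25 timesteps per sample, 1 feature per timestep.
many_steps = as_array.reshape(as_array.shape[0], as_array.shape[1], 1)

print(one_step.shape)    # (250, 1, 25)
print(many_steps.shape)  # (250, 25, 1)
```

With `input_shape=(1, 25)` as written in the model above, `one_step` is the matching layout; `many_steps` would instead need `input_shape=(25, 1)`. The same caveat applies to the targets: `np.asmatrix` turns a 1-D array of N values into a 1 x N matrix, so passing the plain `ytr` array is likely the safer choice.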
However, the following code for an ANN does work:
for month in range(18, 19):
    train = melt5[melt5['Month'] < month]
    val = melt5[melt5['Month'] == month]
    xtr, xts = train.drop(['Sales'], axis=1), val.drop(['Sales'], axis=1)
    ytr, yts = train['Sales'].values, val['Sales'].values
    EPOCHS = 100
    BATCH_SIZE = 10

    # Defining the 3 layered Neural Network
    def build_model():
        model = keras.Sequential([
            keras.layers.Dense(250, activation=tf.nn.softplus,
                               input_shape=(xtr.shape[1],)),
            keras.layers.Dense(250, activation=tf.nn.softplus),
            keras.layers.Dense(1)
        ])
        # Optimize the absolute error (prefer to squared error)
        model.compile(loss='mae', optimizer='adam', metrics=['mae'])
        return model

    model = build_model()
    model.summary()
    # Store training stats
    history = model.fit(xtr, ytr, epochs=EPOCHS, batch_size=BATCH_SIZE,
                        validation_split=0.0, verbose=0)
    nn_predictions0 = np.squeeze(model.predict(xts))
    error2 = mean_absolute_percentage_error(yts, nn_predictions0)
    print('Month %d - NN Error %.1f' % (month, error2))
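Both snippets call `mean_absolute_percentage_error`, which is not shown in this post. For anyone trying to reproduce the problem, a minimal sketch of such a helper might look like the following. The definition (result in percent, rows with a zero actual value skipped) is an assumption and may differ from the helper actually used.

```python
import numpy as np

def mean_absolute_percentage_error(y_true, y_pred):
    """Hypothetical helper: mean of |actual - predicted| / |actual|, in percent."""
    # Flatten so that (N,), (1, N) and (N, 1) inputs are all handled the same way.
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    # Assumption: skip zero actuals to avoid dividing by zero.
    nonzero = y_true != 0
    ratios = np.abs((y_true[nonzero] - y_pred[nonzero]) / y_true[nonzero])
    return float(np.mean(ratios) * 100)

# Two forecasts, each off by 10% of the actual value.
example_error = mean_absolute_percentage_error([100, 200], [110, 180])
print('Example error: %.1f' % example_error)
```

Flattening the inputs is a deliberate choice here, since the LSTM snippet passes a 1 x N matrix as the true values while the ANN snippet passes a flat array.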