I have a question about how many LSTM layers and units to use for multivariate forecasting.
Dataset I am working with: Weather Dataset (Kaggle)
df_final = df.loc['2011-01-01':'2014-10-31']
df_final['Temperature (C)'].plot(figsize=(28,6))
Ultimately I want to predict the temperature (the other parameters too, but mainly temperature).
The data contains hourly readings.
# How many rows per month? (hourly data, approximating a month as 30 days)
rows_per_months = 24*30
test_months = 12  # number of months we want to predict in the future
test_indices = test_months*rows_per_months
test_indices
# train and test split:
train = df_final.iloc[:-test_indices]
test = df_final.iloc[-test_indices:]
len(train)
#[op]: 24960
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
scaled_train = scaler.fit_transform(train)
scaled_test = scaler.transform(test)
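To make the scaling step concrete, here is a minimal NumPy sketch of the arithmetic that MinMaxScaler applies with its default feature_range of (0, 1). The toy arrays are made up, and the names `toy_train`, `toy_test`, `data_min`, and `data_range` are only for illustration. It shows that the minimum and range come from the training data only, so a test value above the training maximum is mapped above 1.

```python
import numpy as np

# Made-up single-feature data; the test set holds a value above the training maximum.
toy_train = np.array([[0.0], [5.0], [10.0]])
toy_test = np.array([[2.5], [12.0]])

# Statistics are computed on the training data only (what fit_transform does).
data_min = toy_train.min(axis=0)
data_range = toy_train.max(axis=0) - data_min

# The same statistics are then applied to both sets (what transform does).
toy_scaled_train = (toy_train - data_min) / data_range
toy_scaled_test = (toy_test - data_min) / data_range  # 12.0 maps to 1.2, outside [0, 1]

# Inverse transform: undo the scaling to get back to the original units.
toy_restored = toy_scaled_test * data_range + data_min
print(toy_scaled_test.ravel(), toy_restored.ravel())
```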
# define generator:
import tensorflow as tf
length = rows_per_months  # length of each input window (in number of timesteps)
batch_size = 30  # number of time-series samples per batch
generator = tf.keras.preprocessing.sequence.TimeseriesGenerator(scaled_train, scaled_train, length=length, batch_size=batch_size)
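To check what the generator feeds the network, here is a rough NumPy sketch of the same windowing: each sample is `length` consecutive rows, and its target is the row that immediately follows. The helper name `make_windows` and the toy array are assumptions for illustration; this is not the Keras implementation and it ignores batching.

```python
import numpy as np

def make_windows(data, targets, length):
    """Return (X, y) where X[i] = data[i:i+length] and y[i] = targets[i+length]."""
    X = np.stack([data[i:i + length] for i in range(len(data) - length)])
    y = targets[length:]
    return X, y

# Toy stand-in for scaled_train: 10 time steps, 4 features (values are made up).
toy = np.arange(40, dtype=float).reshape(10, 4)
X, y = make_windows(toy, toy, length=3)

print(X.shape)  # (7, 3, 4): 7 windows of 3 time steps with 4 features each
print(y.shape)  # (7, 4): one target row (all 4 features) per window
```

Because the same array is passed as both data and targets, every feature is predicted for the next time step, not only the temperature.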
Define the model:
# define model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
model = Sequential()
model.add(tf.keras.layers.LSTM(50, input_shape=(length,scaled_train.shape[1])))
# NOTE: Leave the LSTM activation at its default (tanh); with a non-default activation Keras cannot use the fast cuDNN GPU kernel.
model.add(Dense(scaled_train.shape[1]))
model.compile(optimizer='adam', loss='mse')
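For reference when varying the number of LSTM units, the trainable parameter count of this model can be worked out by hand. This is a small sketch, assuming a standard LSTM layer with biases and 4 input features (taken from the (1, 720, 4) shape printed further down). `lstm_param_count` and `dense_param_count` are hypothetical helper names, not Keras functions.

```python
def lstm_param_count(units, n_features):
    # Four gates, each with input weights, recurrent weights, and a bias vector.
    return 4 * (units * (n_features + units) + units)

def dense_param_count(n_inputs, n_outputs):
    # One weight per input/output pair plus one bias per output.
    return n_inputs * n_outputs + n_outputs

n_features = 4  # assumption: number of columns in scaled_train
for units in (10, 50, 100):
    total = lstm_param_count(units, n_features) + dense_param_count(units, n_features)
    print(units, total)
# 10 644
# 50 11204
# 100 42404
```

The count grows roughly with the square of the number of units, so doubling the units about quadruples the LSTM parameters.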
After training, I evaluate the model on the test data:
first_eval_batch = scaled_train[-length:]
first_eval_batch.shape
first_eval_batch = first_eval_batch.reshape((1,length,scaled_train.shape[1]))
n_features = scaled_test.shape[1]  # number of feature columns, same as scaled_train.shape[1] (4 here, per the shape printed below)
# The model predicts all features for the next time stamp, not only the temperature.
test_predictions = []
first_eval_batch = scaled_train[-length:]
current_batch = first_eval_batch.reshape((1, length, n_features))
print(current_batch.shape)
#output:(1, 720, 4)
for i in range(len(test)):
    # get the prediction one time stamp ahead
    current_pred = model.predict(current_batch)[0]
    # store the prediction
    test_predictions.append(current_pred)
    # update the current batch to include the prediction and drop the first value
    current_batch = np.append(current_batch[:,1:,:],[[current_pred]],axis=1)
# undo the scaling first, then wrap the result in a DataFrame aligned with the test set
true_predictions = scaler.inverse_transform(test_predictions)
true_predictions = pd.DataFrame(data=true_predictions,columns=test.columns,index=test.index)
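The rolling forecast in the loop can be isolated into a small self-contained sketch to check the shapes without a trained network. Here `rolling_forecast` and `mean_predictor` are hypothetical names, and `mean_predictor` is only a stand-in for `model.predict`: it returns the per-feature mean of the window and has nothing to do with the LSTM.

```python
import numpy as np

def rolling_forecast(predict_fn, seed_window, n_steps):
    """Forecast n_steps ahead one step at a time, feeding predictions back in."""
    window = seed_window.copy()           # shape (1, length, n_features)
    preds = []
    for _ in range(n_steps):
        pred = predict_fn(window)[0]      # shape (n_features,)
        preds.append(pred)
        # Drop the oldest time step and append the prediction as the newest one.
        window = np.concatenate([window[:, 1:, :], pred.reshape(1, 1, -1)], axis=1)
    return np.array(preds)                # shape (n_steps, n_features)

def mean_predictor(window):
    # Stand-in for model.predict: per-feature mean over the window's time steps.
    return window.mean(axis=1)            # shape (1, n_features)

# Made-up seed window: length=3 time steps, 4 features.
seed = np.arange(12, dtype=float).reshape(1, 3, 4)
preds = rolling_forecast(mean_predictor, seed, n_steps=5)
print(preds.shape)  # (5, 4)
```

Note that once more than `length` steps have been forecast, the window contains only the model's own predictions and no observed data, so any error in early steps is fed back into every later step.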
Resulting DataFrame:
result_df = pd.concat([test['Temperature (C)'], true_predictions['Temperature (C)']],axis=1)
result_df.plot(figsize=(28,8))
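Because the model fails to predict correct values on the test data, I cannot move on to forecasting further.
P.S.: I have tried varying the number of layers and the number of LSTM units, but nothing seems to help. I also tried different activation functions such as relu, but then a single epoch takes more than 20 hours, because the LSTM does not run on the GPU unless it uses the default activation (i.e. tanh).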