Trouble setting up an LSTM model for multivariate time series forecasting

Time: 2021-03-05 14:19:42

Tags: python time-series lstm tensorflow2.0 forecasting

I have a question about how many LSTMs (layers and units) to use for multivariate forecasting.

The dataset I am working with: Weather Dataset (Kaggle)

df_final = df.loc['2011-01-01':'2014-10-31']
df_final['Temperature (C)'].plot(figsize=(28,6))

Temperature Plot:

Ultimately I want to forecast temperature (the other parameters too, but mainly temperature).

The data has hourly readings.

# How many rows per month? (hourly readings, assuming 30-day months)
rows_per_months = 24 * 30

test_months = 12  # number of months held out for testing

test_indices = test_months * rows_per_months  # 12 * 720 = 8640 rows
test_indices

# train and test split:
train = df_final.iloc[:-test_indices]

# the last test_indices rows are held out as the test set
test = df_final.iloc[-test_indices:]

len(train)
#[op]: 24960
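As a sanity check on the split sizes, the row counts can be reproduced from the date range alone. This is a minimal sketch that assumes exactly one reading for every hour between 2011-01-01 and 2014-10-31 with no gaps, which is consistent with the reported len(train) of 24960; the variable names are illustrative.

```python
from datetime import date

# Assumption: exactly one reading per hour, no missing or duplicated rows.
start, end = date(2011, 1, 1), date(2014, 10, 31)
n_days = (end - start).days + 1      # both endpoints are included by .loc
total_rows = n_days * 24

rows_per_month = 24 * 30             # the 30-day month used in the question
test_rows = 12 * rows_per_month      # 12 held-out months
train_rows = total_rows - test_rows

print(total_rows, test_rows, train_rows)
```

Note that 12 months of 30 days cover 360 days, so the held-out period is slightly shorter than a calendar year.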

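from sklearn.preprocessing import MinMaxScaler

# fit the scaler on the training data only, then reuse it for the test data
scaler = MinMaxScaler()

scaled_train = scaler.fit_transform(train)
scaled_test = scaler.transform(test)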

# define generator:
import tensorflow as tf

length = rows_per_months  # length of each input window (in number of timesteps)
batch_size = 30  # number of time-series samples per batch
generator = tf.keras.preprocessing.sequence.TimeseriesGenerator(scaled_train, scaled_train, length=length, batch_size=batch_size)
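To make explicit what the generator feeds the model, here is a NumPy sketch of the same windowing on a small random array. It assumes the default stride and sampling rate of 1, and `make_windows` is a hypothetical helper, not part of Keras: each sample is `length` consecutive rows, and its target is the row that immediately follows.

```python
import numpy as np

def make_windows(data, length):
    """Return (X, y): X[j] is data[j:j+length], y[j] is data[j+length]."""
    n_samples = len(data) - length
    X = np.stack([data[j:j + length] for j in range(n_samples)])
    y = data[length:]
    return X, y

# Toy stand-in for scaled_train: 10 timesteps, 4 features.
rng = np.random.default_rng(0)
toy = rng.random((10, 4))

X, y = make_windows(toy, length=3)
print(X.shape, y.shape)  # (7, 3, 4) (7, 4)
```

Under the same assumptions, the numbers in the question (24960 training rows, length=720) give 24240 windows, each of shape (720, 4).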

Defining the model:

# define model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()

model.add(tf.keras.layers.LSTM(50, input_shape=(length, scaled_train.shape[1])))
# NOTE: keep the default LSTM activation (tanh); with any other activation the layer does not use the fast cuDNN GPU kernel.
model.add(Dense(scaled_train.shape[1]))

model.compile(optimizer='adam', loss='mse')
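Since the question is about sizing the LSTM, it may help to see how the number of units translates into trainable weights. The sketch below applies the standard LSTM parameter count (four gates, each with input weights, recurrent weights, and a bias) to the model above. It assumes 4 input features, the value reported in the question's printed batch shape; the function names are illustrative.

```python
def lstm_params(units, n_features):
    # 4 gates, each with (n_features + units) weights per unit plus one bias
    return 4 * (units * (n_features + units) + units)

def dense_params(n_inputs, n_outputs):
    # one weight per input/output pair plus one bias per output
    return n_inputs * n_outputs + n_outputs

n_features = 4   # assumption: taken from the (1, 720, 4) batch shape
units = 50

total = lstm_params(units, n_features) + dense_params(units, n_features)
print(lstm_params(units, n_features), dense_params(units, n_features), total)
```

The count does not depend on the window length of 720: the same weights are reused at every timestep.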

After training:

Loss graph

Evaluating the model on the test data:

first_eval_batch = scaled_train[-length:]
first_eval_batch.shape

first_eval_batch = first_eval_batch.reshape((1,length,scaled_train.shape[1]))

n_features = scaled_test.shape[1]  # number of columns; the model predicts all of them at the next timestep
# n_features is 4 here (see the printed shape below), because every column is predicted, not only temperature.
test_predictions = []

first_eval_batch = scaled_train[-length:]
current_batch = first_eval_batch.reshape((1, length, n_features))
print(current_batch.shape)

#output:(1, 720, 4)

import numpy as np

for i in range(len(test)):
    # get the prediction one timestep ahead
    current_pred = model.predict(current_batch)[0]
    # store the prediction
    test_predictions.append(current_pred)

    # update the current batch to include the prediction and drop the oldest timestep
    current_batch = np.append(current_batch[:, 1:, :], [[current_pred]], axis=1)
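The loop above is a recursive (one-step-at-a-time) forecast: every prediction is appended to the input window and the oldest row is dropped, so after `length` steps the window consists entirely of the model's own outputs. The following sketch shows the same window update with a placeholder in place of the trained network. `fake_predict` is a hypothetical stand-in that simply repeats the last row; it is not the model from the question.

```python
import numpy as np

def fake_predict(batch):
    # Hypothetical stand-in for model.predict: repeat the last timestep.
    # Input shape (1, length, n_features) -> output shape (1, n_features).
    return batch[:, -1, :]

length, n_features, n_steps = 5, 4, 8
history = np.arange(length * n_features, dtype=float).reshape(length, n_features)

current_batch = history.reshape((1, length, n_features))
predictions = []
for _ in range(n_steps):
    current_pred = fake_predict(current_batch)[0]   # shape (n_features,)
    predictions.append(current_pred)
    # slide the window: drop the oldest row, append the new prediction
    current_batch = np.concatenate(
        [current_batch[:, 1:, :], current_pred.reshape(1, 1, n_features)], axis=1
    )

predictions = np.array(predictions)
print(predictions.shape, current_batch.shape)  # (8, 4) (1, 5, 4)
```

Because `n_steps` exceeds `length` here, the final window no longer contains any observed rows, only predictions.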

# undo the scaling first, then wrap the result in a DataFrame
true_predictions = scaler.inverse_transform(test_predictions)

true_predictions = pd.DataFrame(data=true_predictions, columns=test.columns, index=test.index)
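For reference, the scaling that `inverse_transform` undoes is a per-column min-max mapping. The sketch below reimplements it in plain NumPy under the assumption that `MinMaxScaler` is used with its default (0, 1) range; the arrays and the helper names `scale` and `unscale` are made up for illustration. It also shows that values outside the training range map outside [0, 1].

```python
import numpy as np

toy_train = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
toy_test = np.array([[12.0, 5.0]])        # lies outside the training range

col_min = toy_train.min(axis=0)           # statistics come from train only
col_range = toy_train.max(axis=0) - col_min

def scale(x):
    return (x - col_min) / col_range

def unscale(x_scaled):
    return x_scaled * col_range + col_min

toy_scaled = scale(toy_test)              # 1.2 and -0.25: outside [0, 1]
print(toy_scaled)
print(unscale(toy_scaled))                # recovers the original values
```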

Resulting DataFrame:

result_df = pd.concat([test['Temperature (C)'], true_predictions['Temperature (C)']],axis=1)

result_df.plot(figsize=(28,8))

Prediction of temperature for test data

Because the model cannot predict the correct values for the test data, I cannot go on to forecast further.

  • How can I solve this problem?

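P.S.: I tried varying the number of layers and the number of LSTM units per layer, but nothing seemed to help. I also tried different activation functions such as relu, but one epoch then takes more than 20 hours, because the LSTM only runs on the GPU with its default activation (i.e. tanh).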

0 answers