I am trying to build an LSTM autoencoder for time-series data. My goal is to take the encoded representations of these sequences produced by the autoencoder and classify them. All of the sequences are very long (about 1M timesteps), and I have 9 of them, so I am feeding the data with fit_generator and an LSTM. However, my loss function does not converge, and the accuracy of the LSTM autoencoder is really poor. Here is part of my code.
from keras.models import Model
from keras.layers import Input, LSTM, Lambda, Dense, TimeDistributed

# Encoder: compress each input window into a single latent vector.
encoder_input = Input(shape=(None, input_dim))
encoder_output = LSTM(latent_space)(encoder_input)

# Decoder: repeat the latent vector once per timestep of the input,
# then reconstruct the original features at every timestep.
decoder_input = Lambda(repeat_vector, output_shape=(None, latent_space))([encoder_output, encoder_input])
decoder_out = LSTM(latent_space, return_sequences=True)(decoder_input)
decoder_output = TimeDistributed(Dense(input_dim))(decoder_out)

autoencoder = Model(encoder_input, decoder_output)
encoder = Model(encoder_input, encoder_output)
autoencoder.compile(optimizer="Adam", loss="mse", metrics=["accuracy"])
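The Lambda layer above calls a repeat_vector helper that is not shown in the snippet. A minimal sketch of the usual implementation (an assumption, since the original definition is not included) repeats the encoder's output once per timestep of the variable-length input:

from keras import backend as K
from keras.layers import RepeatVector

def repeat_vector(args):
    # Assumed helper: args = [encoded_vector, original_input]; repeat
    # the encoding K.shape(input)[1] times to match the time axis.
    layer_to_repeat, sequence_layer = args
    return RepeatVector(K.shape(sequence_layer)[1])(layer_to_repeat)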
def generator(X_data, window_size, step_size):
    size_data = X_data.shape[0]
    while True:
        # Slide a window of `window_size` timesteps over the data,
        # advancing by `step_size` timesteps each iteration.
        for k in range(0, size_data - window_size, step_size):
            x_batch = X_data[k:k + window_size, :]
            # Add a batch dimension: (1, window_size, input_dim).
            x_batch = x_batch.reshape(1, x_batch.shape[0], x_batch.shape[1])
            # The autoencoder's target is its own input.
            y_batch = x_batch
            yield x_batch, y_batch
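As a quick sanity check (a hypothetical smoke test, not part of the training code), the generator can be probed with random data to confirm the batch shapes it yields:

import numpy as np

# Hypothetical smoke test: 1,000 timesteps of 9 channels.
X_dummy = np.random.randn(1000, 9)
x_batch, y_batch = next(generator(X_dummy, window_size=200, step_size=200))
print(x_batch.shape)                     # (1, 200, 9): (batch, timesteps, features)
assert np.array_equal(x_batch, y_batch)  # the target equals the input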
from keras.callbacks import CSVLogger

sliding_window_size = 200
step_size_of_sliding_window = 200
n_epochs = 50

# Log every epoch to the same CSV and collect the per-epoch losses.
csv_logger = CSVLogger('log.csv', append=True, separator=';')
losses = []
for epoch in range(n_epochs):
    print("At Iteration: " + str(epoch))
    history = autoencoder.fit_generator(
        generator(X_tr, sliding_window_size, step_size_of_sliding_window),
        steps_per_epoch=shots_train,
        epochs=1,
        verbose=2,
        callbacks=[csv_logger])
    losses.append(history.history['loss'])
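shots_train is not defined in the snippet; presumably it is the number of windows the generator yields in one full pass over X_tr. Assuming X_tr has shape (n_samples, input_dim), it can be computed to mirror the generator's loop:

# Assumption: steps_per_epoch = number of sliding windows per pass,
# matching the generator's range(0, size - window, step) loop.
shots_train = len(range(0, X_tr.shape[0] - sliding_window_size,
                        step_size_of_sliding_window))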
In short, I get results like the following:
Epoch 1/1
- 6s - loss: 0.0319 - acc: 0.5425
At Iteration: 6
Epoch 1/1
- 6s - loss: 0.0326 - acc: 0.4555
At Iteration: 7
Epoch 1/1
- 6s - loss: 0.0301 - acc: 0.4865
At Iteration: 8
Epoch 1/1
- 6s - loss: 0.0304 - acc: 0.5020
At Iteration: 9
Epoch 1/1
- 6s - loss: 0.0313 - acc: 0.4780
At Iteration: 10
Epoch 1/1
- 6s - loss: 0.0313 - acc: 0.5090
At Iteration: 11
Epoch 1/1
- 6s - loss: 0.0313 - acc: 0.4675
The accuracy is really poor, and the reconstructed sequences differ noticeably from the inputs. How can I improve this? I have already tried different code (latent-space) sizes for the LSTM.