For many days I have been facing a problem where evaluating an LSTM model gives results different from those obtained while fitting it. In detail, I run the following code in RStudio:
seed <- 42
use_session_with_seed(seed)

network <- keras_model_sequential() %>%
  layer_lstm(units = 50, input_shape = InputShape, batch_size = batch_size,
             return_sequences = TRUE, stateful = TRUE) %>%
  layer_lstm(units = 50, return_sequences = FALSE, stateful = TRUE) %>%
  layer_dense(units = 3, activation = "softmax")

network %>% compile(
  optimizer = "rmsprop",
  loss = "categorical_crossentropy",
  metrics = c("accuracy")
)

epochs <- 2
cat('Training\n')
for (i in 1:epochs) {
  # stateful LSTM: fit one epoch at a time without shuffling,
  # then reset the layer states between epochs
  network %>% fit(Train, Train_labels, batch_size = batch_size,
                  epochs = 1, verbose = 1, shuffle = FALSE,
                  validation_data = list(Train, Train_labels))
  network %>% reset_states()
}
I get the following output:
Train on 6498 samples, validate on 6498 samples
Epoch 1/1
6498/6498 [==============================] - 439s 68ms/step - loss: 0.7279 -
acc: 0.7081 - val_loss: 2.5372 - val_acc: 0.3307
Train on 6498 samples, validate on 6498 samples
Epoch 1/1
6498/6498 [==============================] - 448s 69ms/step - loss: 0.6757 -
acc: 0.7419 - val_loss: 2.7820 - val_acc: 0.3350
As the results above show, this is exactly the problem I mentioned: the validation metrics differ sharply from the training metrics even though the validation data is the training data itself. I searched many forums for more information, and I found that the problem may be caused by BatchNormalization, as mentioned by François Chollet in this. In addition, Vasilis Vryniotis, in his blog, was the first to identify the problem caused by BatchNormalization. However, his blog discusses convolutional networks rather than LSTMs, so I hope someone can give me a hint about the LSTM problem I mentioned. In fact, I tried changing the code by adding "trainable = FALSE" to both layer_lstm calls and "trainable = TRUE" to layer_dense. However, the accuracy I got was still estimated at 0.33. For more information, Train and Train_labels are described as:
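For reference, the "trainable = FALSE" variant I tried looks like the sketch below (InputShape and batch_size are defined as in the original code above; only the trainable flags differ from my first model):

```r
# Sketch of the attempt described above: freeze both LSTM layers,
# keep only the dense softmax layer trainable
network <- keras_model_sequential() %>%
  layer_lstm(units = 50, input_shape = InputShape, batch_size = batch_size,
             return_sequences = TRUE, stateful = TRUE, trainable = FALSE) %>%
  layer_lstm(units = 50, return_sequences = FALSE, stateful = TRUE,
             trainable = FALSE) %>%
  layer_dense(units = 3, activation = "softmax", trainable = TRUE)
```

With this variant the reported accuracy stays around 0.33, i.e. chance level for 3 classes.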
str(Train)
num [1:6498, 1:24, 1:18] 0.08 0.0865 0.1579 0.299 0.1846 ...
str(Train_labels)
num [1:6498, 1:3] 1 0 0 0 0 0 0 0 0 1 ...
Train_labels is one-hot encoded over 3 classes (0, 1, and 2). Please help me.