Keras RNN loss does not change over epochs

Date: 2016-09-03 17:08:28

Tags: machine-learning neural-network deep-learning keras recurrent-neural-network

I built an RNN using Keras. The RNN is used to solve a regression problem:

# Keras 1.x API
from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed, BatchNormalization
from keras.optimizers import RMSprop

def RNN_keras(feat_num, timestep_num=100):
    model = Sequential()
    model.add(BatchNormalization(input_shape=(timestep_num, feat_num)))
    model.add(LSTM(input_shape=(timestep_num, feat_num), output_dim=512, activation='relu', return_sequences=True))
    model.add(BatchNormalization())
    model.add(LSTM(output_dim=128, activation='relu', return_sequences=True))
    model.add(BatchNormalization())
    model.add(TimeDistributed(Dense(output_dim=1, activation='relu'))) # sequence labeling

    rmsprop = RMSprop(lr=0.00001, rho=0.9, epsilon=1e-08)
    model.compile(loss='mean_squared_error',
                  optimizer=rmsprop,
                  metrics=['mean_squared_error'])
    return model

The whole process seems to run fine, but the loss stays constant across epochs:

61267 in the training set
6808 in the test set

Building training input vectors ...
888 unique feature names
The length of each vector will be 888
Using TensorFlow backend.

Build model...

# Each batch has 1280 examples
# The training data are shuffled at the beginning of each epoch.

****** Iterating over each batch of the training data ******
Epoch 1/3 : Batch 1/48 | loss = 11011073.000000 | root_mean_squared_error = 3318.232910
Epoch 1/3 : Batch 2/48 | loss = 620.271667 | root_mean_squared_error = 24.904161
Epoch 1/3 : Batch 3/48 | loss = 620.068665 | root_mean_squared_error = 24.900017
......
Epoch 1/3 : Batch 47/48 | loss = 618.046448 | root_mean_squared_error = 24.859678
Epoch 1/3 : Batch 48/48 | loss = 652.977051 | root_mean_squared_error = 25.552946
****** Epoch 1: RMSD(training) = 24.897174 

Epoch 2/3 : Batch 1/48 | loss = 607.372620 | root_mean_squared_error = 24.644049
Epoch 2/3 : Batch 2/48 | loss = 599.667786 | root_mean_squared_error = 24.487448
Epoch 2/3 : Batch 3/48 | loss = 621.368103 | root_mean_squared_error = 24.926300
......
Epoch 2/3 : Batch 47/48 | loss = 620.133667 | root_mean_squared_error = 24.901398
Epoch 2/3 : Batch 48/48 | loss = 639.971924 | root_mean_squared_error = 25.297264
****** Epoch 2: RMSD(training) = 24.897174 

Epoch 3/3 : Batch 1/48 | loss = 651.519836 | root_mean_squared_error = 25.523636
Epoch 3/3 : Batch 2/48 | loss = 673.582581 | root_mean_squared_error = 25.952084
Epoch 3/3 : Batch 3/48 | loss = 613.930054 | root_mean_squared_error = 24.776562
......
Epoch 3/3 : Batch 47/48 | loss = 624.460327 | root_mean_squared_error = 24.988203
Epoch 3/3 : Batch 48/48 | loss = 629.544250 | root_mean_squared_error = 25.090448
****** Epoch 3: RMSD(training) = 24.897174 

I don't think this is normal. Am I missing something?

UPDATE: I found that after every epoch all of the predictions are zero. That is why all the RMSDs are identical: the predictions are all the same, namely 0. I checked the training y; it contains only a few zeros, so this is not caused by imbalanced data.

So now I am wondering whether it is caused by the layers and activations I am using.
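One mechanism consistent with all-zero predictions is a "dead" ReLU on the output layer. The sketch below is pure Python with made-up pre-activation values, purely to illustrate why a ReLU output can get stuck at zero:

```python
def relu(z):
    # ReLU clips negative pre-activations to zero
    return max(0.0, z)

def relu_grad(z):
    # gradient of ReLU: 0 for negative inputs, 1 for positive
    return 1.0 if z > 0 else 0.0

# Hypothetical pre-activations of the output unit, all negative:
pre_activations = [-3.2, -0.7, -1.5]

outputs = [relu(z) for z in pre_activations]     # every prediction is 0
grads = [relu_grad(z) for z in pre_activations]  # every gradient is 0
```

With a zero gradient on every example, the output layer's weights stop updating, so the network stays stuck at constant zero predictions.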

2 Answers:

Answer 0 (score: 0):

Your RNN function seems fine.

How fast the loss decreases depends on the optimizer and the learning rate.

You are already using a decay rate (rho) of 0.9. Try a higher learning rate; it will decay at that 0.9 rate anyway.

Try other optimizers with different learning rates. The optimizers provided by Keras are listed here: https://keras.io/optimizers/

Often, some optimizers work well on certain datasets while others fail.
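The effect of the learning rate can be illustrated with plain gradient descent on a toy quadratic. This is a sketch, not the actual model; the function and values are made up for illustration:

```python
def descend(lr, steps=50, w=0.0):
    # Minimize f(w) = (w - 3)^2 with plain gradient descent.
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)  # df/dw
        w -= lr * grad
    return w

w_fast = descend(lr=0.1)      # converges to the minimum at w = 3
w_slow = descend(lr=0.00001)  # barely moves from the starting point
```

With lr=0.00001, as in the question, 50 steps move w only about 0.1% of the way to the optimum, which matches a loss that looks frozen across epochs.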

Answer 1 (score: 0):

Since you are using the RNN for a regression problem (not classification), you should use a 'linear' activation in the last layer.

In your code,

model.add(TimeDistributed(Dense(output_dim=1, activation='relu'))) # sequence labeling 

change activation='relu' to activation='linear'.

If that does not work, also remove activation='relu' from the second layer.

The learning rate for rmsprop usually lies between 0.1 and 0.0001.
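Why the linear activation helps can be sketched in pure Python: unlike ReLU, a linear output unit passes gradient for any pre-activation, so the layer keeps learning even when its outputs start out at zero (the value of z below is illustrative only):

```python
def relu_grad(z):
    # ReLU gradient is 0 for negative pre-activations
    return 1.0 if z > 0 else 0.0

def linear_grad(z):
    # a linear output y = z has gradient 1 everywhere
    return 1.0

z = -1.8  # a negative pre-activation, as suspected in the question

blocked = relu_grad(z)    # ReLU blocks the learning signal here
flowing = linear_grad(z)  # linear lets the gradient through
```

With the linear output, the layers below continue to receive a learning signal regardless of the sign of the pre-activation, so the network cannot get stuck at constant zero predictions the way a ReLU output can.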