I am training a PhasedLSTM, an LSTM variant, for regression. I am using tensorflow.contrib.rnn.PhasedLSTMCell, which expects a vector of timestamps in addition to the features. Here is my model definition:
from tensorflow.keras.layers import Dense, RNN, Input
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
from tensorflow.contrib.rnn import PhasedLSTMCell
hidden_size = 128
num_features = 217
num_timesteps = 2016
input_layer_timestamps = Input(shape=(num_timesteps, 1), name='time_input')
input_layer_features = Input(shape=(num_timesteps, num_features), name='data_input')
timed_inputs = (input_layer_timestamps, input_layer_features)
P_cell = PhasedLSTMCell(hidden_size)
PLSTM_layer = RNN(P_cell, return_sequences=False, name='Phased_LSTM_1')(timed_inputs)
output_layer = Dense(2, activation=None)(PLSTM_layer)
model = Model(inputs = [input_layer_timestamps, input_layer_features],
outputs = [output_layer])
lstm_optimizer = Adam(lr=Adam_lr, clipnorm=5.)
model.compile(optimizer=lstm_optimizer,
loss='mse')
model.summary()
The model compiles and trains fine, and the validation results look reasonable, with sensible errors. However, model.summary() for the snippet above prints:
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
time_input (InputLayer) (None, 2016, 1) 0
__________________________________________________________________________________________________
data_input (InputLayer) (None, 2016, 217) 0
__________________________________________________________________________________________________
Phased_LSTM_1 (RNN) (None, 128) 0 time_input[0][0]
data_input[0][0]
__________________________________________________________________________________________________
dense_1 (Dense) (None, 2) 66 Phased_LSTM_1[0][0]
==================================================================================================
Total params: 258
Trainable params: 258
Non-trainable params: 0
__________________________________________________________________________________________________
Notably, the number of trainable parameters of Phased_LSTM_1 is 0.
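As a sanity check on that 0, I computed by hand roughly how many parameters a PhasedLSTM of this size should report. This is a sketch based on my reading of the PhasedLSTM paper (a standard LSTM kernel and bias plus per-unit period, shift and on-ratio time-gate variables); I have not verified it against the tf.contrib implementation's exact variable layout:

```python
# Expected parameter count for a PhasedLSTM layer, per the paper:
# a standard LSTM (kernel + bias for the 4 gates) plus three extra
# trainable scalars per unit for the time gate (period, shift, on-ratio).
hidden_size = 128
num_features = 217  # timestamps feed the time gate, not the gate kernels

# Standard LSTM: 4 gates, each with an (input + recurrent) kernel and a bias.
lstm_params = 4 * ((num_features + hidden_size) * hidden_size + hidden_size)

# Time gate: period tau, phase shift s and open ratio r_on, one per unit.
time_gate_params = 3 * hidden_size

total_params = lstm_params + time_gate_params
print(lstm_params, time_gate_params, total_params)  # 177152 384 177536
```

Whatever the exact layout, the count should be on the order of 1.7e5, far more than the 258 total parameters the summary reports.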
And if I then inspect model.weights, the LSTM part does not show up at all:
ipdb> NN.weights
[<tf.Variable 'scope/dense/kernel:0' shape=(128, 2) dtype=float32>, <tf.Variable 'scope/dense/bias:0' shape=(2,) dtype=float32>]
Moreover, if I save the model with model.save() and load it back, either with tensorflow.keras.models.load_model or by redefining the model and calling model.load_weights(), the model behaves completely differently. The fairly good validation accuracy degrades into predictions that are pure noise; it looks as if the originally decent regression model has gone from an RNN to something like a single 128-neuron dense layer with the wrong activation.
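To pin down the save/load problem, the check I have in mind is a weight round-trip comparison: grab model.get_weights() before saving and compare it element-wise with the reloaded model's weights. A generic numpy helper (the model and file names in the usage comment are illustrative):

```python
import numpy as np

def weights_match(before, after, atol=1e-6):
    """True if two lists of weight arrays have the same shapes and values."""
    return (len(before) == len(after)
            and all(b.shape == a.shape and np.allclose(b, a, atol=atol)
                    for b, a in zip(before, after)))

# Usage against a real model would look like:
#   w_before = model.get_weights()
#   model.save('plstm.h5')
#   reloaded = load_model('plstm.h5')
#   assert weights_match(w_before, reloaded.get_weights())
```

Of course, since Keras never tracked the PhasedLSTM variables in the first place, this check would pass trivially while the real problem stays hidden, which is part of why I am asking.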
I have three questions:

1) What is going on here? Is this a bug, or am I misusing the Keras RNN API?

2) How can I verify that the LSTM is actually being trained? I would like to inspect the layers the usual way, via model.weights and model.get_weights().

3) How can I save and load this model without breaking it? My input and output pipelines and the modules around them are built for Keras, so I would rather not rewrite the network definition in "plain" TensorFlow.
I am using tensorflow-gpu 1.13.1 with Keras 2.2.4-tf.