I am working on building an LSTM model. It trains and tests fine within the same session, but I run into problems when I save the model and load it in another session.
Problem 1) When I save the model with model.save('my_model.h5'), I get a strange warning:
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/keras/engine/network.py:872: UserWarning: Layer lstm_1 was passed non-serializable keyword arguments: {'initial_state': [<tf.Tensor 's0:0' shape=(?, 128) dtype=float32>, <tf.Tensor 'c0:0' shape=(?, 128) dtype=float32>]}. They will not be included in the serialized model (and thus will be missing at deserialization time).
'. They will not be included '
Problem 2) After loading the model with model = load_model('my_model.h5'), it produces very inaccurate results at test time.
I tried saving the weights with model.save_weights and reloading them with model.load_weights, but to no avail.
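For reference, a minimal sketch of the two save/load paths described above (the file names are only examples; load_weights assumes a model with the identical architecture has already been built in the new session):

from keras.models import load_model

# Path 1: full-model save/load (this is the save call that triggers the UserWarning)
model.save('my_model.h5')
# ... in another session ...
model = load_model('my_model.h5')

# Path 2: weights-only save/load; the graph must be rebuilt before loading
model.save_weights('my_model_weights.h5')
# ... in another session, after recreating the same architecture ...
model.load_weights('my_model_weights.h5')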
What is going on here?
Update:
def model(Tx, Ty, n_a, n_s, human_vocab_size, machine_vocab_size):
    X = Input(shape=(Tx, human_vocab_size))
    s0 = Input(shape=(n_s,), name='s0')
    c0 = Input(shape=(n_s,), name='c0')
    s = s0
    c = c0

    # Initialize empty list of outputs
    outputs = []

    ### START CODE HERE ###
    # Step 1: Define your pre-attention Bi-LSTM. Remember to use return_sequences=True. (≈ 1 line)
    a = Bidirectional(LSTM(n_a, return_sequences=True))(X)

    # Step 2: Iterate for Ty steps
    for t in range(Ty):
        # Step 2.A: Perform one step of the attention mechanism to get back the context vector at step t (≈ 1 line)
        context = one_step_attention(a, s)

        # Step 2.B: Apply the post-attention LSTM cell to the "context" vector.
        # Don't forget to pass: initial_state = [hidden state, cell state] (≈ 1 line)
        s, _, c = post_activation_LSTM_cell(context, initial_state=[s, c])

        # Step 2.C: Apply Dense layer to the hidden state output of the post-attention LSTM (≈ 1 line)
        out = output_layer(s)

        # Step 2.D: Append "out" to the "outputs" list (≈ 1 line)
        outputs.append(out)

    # Step 3: Create model instance taking three inputs and returning the list of outputs. (≈ 1 line)
    model = Model(inputs=[X, s0, c0], outputs=outputs)
    ### END CODE HERE ###

    return model
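For context, the s0/c0 Input tensors are passed as initial_state on the first loop iteration, and they are exactly the tensors named in the warning ('s0:0', 'c0:0'). The model is then trained by feeding all-zero arrays for them. A rough sketch of that training call (the instance name nmt_model and the data arrays Xoh/Yoh are assumptions, not from the original post):

import numpy as np

# Build the graph with the model() function above
nmt_model = model(Tx, Ty, n_a, n_s, human_vocab_size, machine_vocab_size)
nmt_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

m = Xoh.shape[0]                    # number of training examples (Xoh/Yoh assumed one-hot arrays)
s0 = np.zeros((m, n_s))             # zero initial hidden state for the post-attention LSTM
c0 = np.zeros((m, n_s))             # zero initial cell state
outputs = list(Yoh.swapaxes(0, 1))  # Ty target arrays, each of shape (m, machine_vocab_size)

nmt_model.fit([Xoh, s0, c0], outputs, epochs=1, batch_size=100)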
Answer 0 (score: 0)
There is an issue on GitHub describing this problem. It was resolved there by saving the model in a different way. In that thread they explain that it is probably because you pass the layer a state that is only known during training and cannot be reconstructed at load time. That state is not stored in the saved file, which is what the warning in Problem 1 is telling you, and the missing state is what then causes Problem 2.
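In practice, the usual workaround for this kind of model is to avoid load_model entirely: save only the weights, rebuild the graph with the same model() function in the new session, and load the weights into it. A hedged sketch under those assumptions (file name and instance names are illustrative):

import numpy as np

# Training session: persist only the weights
nmt_model.save_weights('my_model_weights.h5')

# New session: recreate the identical architecture, then load the weights
nmt_model = model(Tx, Ty, n_a, n_s, human_vocab_size, machine_vocab_size)
nmt_model.load_weights('my_model_weights.h5')

# Inference still needs the zero initial states as extra inputs (Xoh_test is an assumed test array)
m = Xoh_test.shape[0]
s0 = np.zeros((m, n_s))
c0 = np.zeros((m, n_s))
predictions = nmt_model.predict([Xoh_test, s0, c0])

Note that rebuilding in a new session also requires recreating the shared layers used inside model() (one_step_attention, post_activation_LSTM_cell, output_layer) with exactly the same definitions, otherwise the loaded weights will not line up with the new graph.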