I am working on building an LSTM model. It trains and tests fine within the same session, but I run into problems when I save the model and load it in another session.
Problem 1) When I save the model with model.save('my_model.h5'), I get a strange warning:
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/keras/engine/network.py:872: UserWarning: Layer lstm_1 was passed non-serializable keyword arguments: {'initial_state': [<tf.Tensor 's0:0' shape=(?, 128) dtype=float32>, <tf.Tensor 'c0:0' shape=(?, 128) dtype=float32>]}. They will not be included in the serialized model (and thus will be missing at deserialization time).
'. They will not be included '
Problem 2) After loading the model with model = load_model('my_model.h5'), it produces very inaccurate results at test time.
I tried saving the weights with model.save_weights and reloading them with model.load_weights, but to no avail.
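For reference, a minimal sketch of the two save/load paths described above (the file names are only examples; load_weights assumes a model with the identical architecture has already been built in the new session):

from keras.models import load_model

# Path 1: full-model save/load (this is the save call that triggers the UserWarning)
model.save('my_model.h5')
# ... in another session ...
model = load_model('my_model.h5')

# Path 2: weights-only save/load; the graph must be rebuilt before loading
model.save_weights('my_model_weights.h5')
# ... in another session, after recreating the same architecture ...
model.load_weights('my_model_weights.h5')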
What is going on here?
Update:
def model(Tx, Ty, n_a, n_s, human_vocab_size, machine_vocab_size):
    X = Input(shape=(Tx, human_vocab_size))
    s0 = Input(shape=(n_s,), name='s0')
    c0 = Input(shape=(n_s,), name='c0')
    s = s0
    c = c0

    # Initialize empty list of outputs
    outputs = []

    ### START CODE HERE ###
    # Step 1: Define your pre-attention Bi-LSTM. Remember to use return_sequences=True. (≈ 1 line)
    a = Bidirectional(LSTM(n_a, return_sequences=True))(X)

    # Step 2: Iterate for Ty steps
    for t in range(Ty):
        # Step 2.A: Perform one step of the attention mechanism to get back the context vector at step t (≈ 1 line)
        context = one_step_attention(a, s)

        # Step 2.B: Apply the post-attention LSTM cell to the "context" vector.
        # Don't forget to pass: initial_state = [hidden state, cell state] (≈ 1 line)
        s, _, c = post_activation_LSTM_cell(context, initial_state=[s, c])

        # Step 2.C: Apply Dense layer to the hidden state output of the post-attention LSTM (≈ 1 line)
        out = output_layer(s)

        # Step 2.D: Append "out" to the "outputs" list (≈ 1 line)
        outputs.append(out)

    # Step 3: Create model instance taking three inputs and returning the list of outputs. (≈ 1 line)
    model = Model(inputs=[X, s0, c0], outputs=outputs)
    ### END CODE HERE ###

    return model
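For context, the s0/c0 Input tensors are passed as initial_state on the first loop iteration, and they are exactly the tensors named in the warning ('s0:0', 'c0:0'). The model is then trained by feeding all-zero arrays for them. A rough sketch of that training call (the instance name nmt_model and the data arrays Xoh/Yoh are assumptions, not from the original post):

import numpy as np

# Build the graph with the model() function above
nmt_model = model(Tx, Ty, n_a, n_s, human_vocab_size, machine_vocab_size)
nmt_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

m = Xoh.shape[0]                    # number of training examples (Xoh/Yoh assumed one-hot arrays)
s0 = np.zeros((m, n_s))             # zero initial hidden state for the post-attention LSTM
c0 = np.zeros((m, n_s))             # zero initial cell state
outputs = list(Yoh.swapaxes(0, 1))  # Ty target arrays, each of shape (m, machine_vocab_size)

nmt_model.fit([Xoh, s0, c0], outputs, epochs=1, batch_size=100)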
Answer 0 (score: 0)
There is an issue on GitHub describing this problem. It was resolved there by saving the model in a different way. In that thread they explain that it is probably because you pass the layer a state that is only known during training and cannot be reconstructed at load time. That state is not stored in the saved file, which is what the warning in Problem 1 is telling you, and the missing state is what then causes Problem 2.
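In practice, the usual workaround for this kind of model is to avoid load_model entirely: save only the weights, rebuild the graph with the same model() function in the new session, and load the weights into it. A hedged sketch under those assumptions (file name and instance names are illustrative):

import numpy as np

# Training session: persist only the weights
nmt_model.save_weights('my_model_weights.h5')

# New session: recreate the identical architecture, then load the weights
nmt_model = model(Tx, Ty, n_a, n_s, human_vocab_size, machine_vocab_size)
nmt_model.load_weights('my_model_weights.h5')

# Inference still needs the zero initial states as extra inputs (Xoh_test is an assumed test array)
m = Xoh_test.shape[0]
s0 = np.zeros((m, n_s))
c0 = np.zeros((m, n_s))
predictions = nmt_model.predict([Xoh_test, s0, c0])

Note that rebuilding in a new session also requires recreating the shared layers used inside model() (one_step_attention, post_activation_LSTM_cell, output_layer) with exactly the same definitions, otherwise the loaded weights will not line up with the new graph.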