How can I use a multilayer encoder-decoder model for inference in Keras?

Asked: 2019-01-31 21:48:46

Tags: python keras recurrent-neural-network autoencoder seq2seq

I closely followed the blog post at https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html to develop a sequence-to-sequence model for time series forecasting.
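For reference, the single-layer inference setup from the blog post works fine for me; it reuses the trained decoder LSTM with fresh state Inputs so the decoder can be driven one step at a time:

# Inference setup from the blog post (single-layer case).
encoder_model = Model(encoder_inputs, encoder_states)

decoder_state_input_h = Input(shape=(latent_dim,))
decoder_state_input_c = Input(shape=(latent_dim,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_lstm(
    decoder_inputs, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model([decoder_inputs] + decoder_states_inputs,
                      [decoder_outputs] + decoder_states)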

I tried to extend the model to multiple layers with the following code:

encoder_inputs = Input(shape=(None, train_data.shape[2]), name='encoder_mask')
# Allows handling of variable-length inputs by masking timesteps equal to mask_value.
masker = Masking(mask_value=mask_value)
masker(encoder_inputs)
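# NOTE: the masked tensor returned by masker(encoder_inputs) is discarded here,
# and encoder_inputs is redefined below.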

# Define an input series and encode it with an LSTM.
if len(hidden_layers) == 1:
    encoder_inputs = Input(shape=(None, train_data.shape[2]), name='encoder')
    encoder = CuDNNLSTM(latent_dim, return_state=True)
    encoder_outputs, state_h, state_c = encoder(encoder_inputs)
else:
    units = hidden_layers[0]
    encoder_inputs = Input(shape=(None, train_data.shape[2]), name='encoder')
    encoder = CuDNNLSTM(units=units, return_sequences=True)(encoder_inputs)
    for units in hidden_layers[1:-1]:
        encoder = CuDNNLSTM(units, return_sequences=True)(encoder)
    units = hidden_layers[-1]
    encoder_outputs, state_h, state_c = CuDNNLSTM(units, return_state=True)(encoder)

# We discard `encoder_outputs` and only keep the final states. These represent the "context"
# vector that we use as the basis for decoding.
encoder_states = [state_h, state_c]

# We set up our decoder using `encoder_states` as initial state.
# We return full output sequences and return internal states as well.
# We don't use the return states in the training model, but we will use them in inference.
decoder_layers = []
if len(hidden_layers) == 1:
    # Set up the decoder, using `encoder_states` as initial state.
    # This is where teacher forcing inputs are fed in.
    decoder_inputs = Input(shape=(None, 1), name='decoder')
    decoder_lstm = CuDNNLSTM(hidden_layers[-1], return_sequences=True, return_state=True, name='decoder_first')
    decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
else:
    units = hidden_layers[-1]
    decoder_inputs = Input(shape=(None, 1), name='decoder')
    decoder_first = CuDNNLSTM(units, return_sequences=True, name='decoder_first')
    decoder = decoder_first(decoder_inputs, initial_state=encoder_states)

    for units in hidden_layers[::-1][1:-1]:
        decoder_hidden = CuDNNLSTM(units, return_sequences=True)
        decoder_layers.append(decoder_hidden)
        decoder = decoder_hidden(decoder)

    units = hidden_layers[0]
    decoder_last = CuDNNLSTM(units, return_sequences=True, return_state=True, name='decoder_last')
    decoder_outputs, _, _ = decoder_last(decoder)

decoder_dense = Dense(units=1, activation='linear')  # 1 continuous output at each timestep
decoder_outputs = decoder_dense(decoder_outputs)
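
The training model is then assembled from these pieces as in the blog post (the optimizer and loss below are just what I use for this regression task):

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='adam', loss='mse')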

I am not sure how to go on to create the new inference model so that my input sequences can have varying prediction lengths. At first I tried feeding the sequence only into the first decoder layer, as in the example from the blog post. However, that did not work, because the first layer does not return its states; only the last layer does. I then tried feeding the input into the last decoder layer instead, but that did not work either, because it expects a 3D input rather than a 2D one. Then I found the post Multilayer Seq2Seq model with LSTM in Keras, which made me realize that I actually need to feed the sequence through all the decoder layers manually (which is why I append them to a list above; I don't know whether that is the correct or best way to do it), as shown below:

# from our previous model - mapping encoder sequence to state vectors
encoder_model = Model(encoder_inputs, encoder_states)

# A modified version of the decoding stage that takes in predicted target inputs
# and encoded state vectors, returning predicted target outputs and decoder state vectors.
# We need to hang onto these state vectors to run the next step of the inference loop.
decoder_state_input_h = Input(shape=(units,))
decoder_state_input_c = Input(shape=(units,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]

#decoder_outputs = decoder_first(decoder_inputs, initial_state=decoder_states_inputs)

decoder_outputs = decoder_first(decoder_inputs, initial_state=encoder_states)
decoder_outputs = decoder_layers[0](decoder_outputs)        
decoder_outputs, state_h, state_c = decoder_last(decoder_outputs)

decoder_states = [state_h, state_c]

decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model([decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states)

However, after executing the snippet above, I now get the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-18-77cda21e84ad> in <module>
     19 
     20 decoder_outputs = decoder_dense(decoder_outputs)
---> 21 decoder_model = Model([decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states)
     22 
     23 train_mean = train_dict['train_mean']

/lib/python3.6/site-packages/keras/legacy/interfaces.py in wrapper(*args, **kwargs)
     89                 warnings.warn('Update your `' + object_name + '` call to the ' +
     90                               'Keras 2 API: ' + signature, stacklevel=2)
---> 91             return func(*args, **kwargs)
     92         wrapper._original_function = func
     93         return wrapper

/lib/python3.6/site-packages/keras/engine/network.py in __init__(self, *args, **kwargs)
     91                 'inputs' in kwargs and 'outputs' in kwargs):
     92             # Graph network
---> 93             self._init_graph_network(*args, **kwargs)
     94         else:
     95             # Subclassed network

/lib/python3.6/site-packages/keras/engine/network.py in _init_graph_network(self, inputs, outputs, name)
    229         # Keep track of the network's nodes and layers.
    230         nodes, nodes_by_depth, layers, layers_by_depth = _map_graph_network(
--> 231             self.inputs, self.outputs)
    232         self._network_nodes = nodes
    233         self._nodes_by_depth = nodes_by_depth

/lib/python3.6/site-packages/keras/engine/network.py in _map_graph_network(inputs, outputs)
   1441                                          'The following previous layers '
   1442                                          'were accessed without issue: ' +
-> 1443                                          str(layers_with_complete_input))
   1444                 for x in node.output_tensors:
   1445                     computable_tensors.append(x)

ValueError: Graph disconnected: cannot obtain value for tensor Tensor("encoder:0", shape=(?, ?, 10), dtype=float32) at layer "encoder". The following previous layers were accessed without issue: []

Clearly I am doing something wrong. When I have only a single encoder and decoder layer, everything works as expected; the problems only appear once I try to use multiple decoder layers. I don't fully understand how to set up a multilayer decoder for inference.
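My current best guess, which I have not been able to verify, is that every decoder layer needs return_state=True (not just the last one) and its own pair of state Inputs, so that the inference decoder becomes a self-contained graph that never references the encoder's tensors (which seems to be what the "Graph disconnected" error is complaining about). Roughly:

# Untested sketch: assumes every decoder LSTM in the training model was
# created with return_state=True, so the same trained layers can be
# reused here with one pair of state Inputs per layer.
all_decoder_lstms = [decoder_first] + decoder_layers + [decoder_last]
decoder_states_inputs = []
decoder_states = []
x = decoder_inputs
for i, (lstm, units) in enumerate(zip(all_decoder_lstms, hidden_layers[::-1])):
    state_h_in = Input(shape=(units,), name='dec_h_in_%d' % i)
    state_c_in = Input(shape=(units,), name='dec_c_in_%d' % i)
    decoder_states_inputs += [state_h_in, state_c_in]
    x, state_h, state_c = lstm(x, initial_state=[state_h_in, state_c_in])
    decoder_states += [state_h, state_c]

decoder_outputs = decoder_dense(x)
decoder_model = Model([decoder_inputs] + decoder_states_inputs,
                      [decoder_outputs] + decoder_states)

Even if that wiring is right, I am still unsure what to feed as the initial states of the deeper layers at the first decoding step, since the encoder only returns the states of its last layer.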

0 Answers
