Following the blog post https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html, I developed a sequence-to-sequence model for time series forecasting. I then tried to extend the model to multiple layers with the following code:
from keras.models import Model
from keras.layers import Input, Masking, Dense, CuDNNLSTM

# train_data, mask_value, hidden_layers and latent_dim are defined earlier in my notebook.
encoder_inputs = Input(shape=(None, train_data.shape[2]), name='encoder_mask')
# Allow variable-length inputs by masking timesteps equal to mask_value.
masker = Masking(mask_value=mask_value)
masker(encoder_inputs)
# Define an input series and encode it with an LSTM.
if len(hidden_layers) == 1:
    encoder_inputs = Input(shape=(None, train_data.shape[2]), name='encoder')
    encoder = CuDNNLSTM(latent_dim, return_state=True)
    encoder_outputs, state_h, state_c = encoder(encoder_inputs)
else:
    units = hidden_layers[0]
    encoder_inputs = Input(shape=(None, train_data.shape[2]), name='encoder')
    encoder = CuDNNLSTM(units=units, return_sequences=True)(encoder_inputs)
    for units in hidden_layers[1:-1]:
        encoder = CuDNNLSTM(units, return_sequences=True)(encoder)
    units = hidden_layers[-1]
    encoder_outputs, state_h, state_c = CuDNNLSTM(units, return_state=True)(encoder)
# We discard `encoder_outputs` and only keep the final states. These represent the "context"
# vector that we use as the basis for decoding.
encoder_states = [state_h, state_c]
# We set up our decoder using `encoder_states` as initial state.
# We return full output sequences and return internal states as well.
# We don't use the return states in the training model, but we will use them in inference.
decoder_layers = []
if len(hidden_layers) == 1:
    # Set up the decoder, using `encoder_states` as initial state.
    # This is where teacher forcing inputs are fed in.
    decoder_inputs = Input(shape=(None, 1), name='decoder')
    decoder_lstm = CuDNNLSTM(hidden_layers[-1], return_sequences=True, return_state=True, name='decoder_first')
    decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
else:
    units = hidden_layers[-1]
    decoder_inputs = Input(shape=(None, 1), name='decoder')
    decoder_first = CuDNNLSTM(units, return_sequences=True, name='decoder_first')
    decoder = decoder_first(decoder_inputs, initial_state=encoder_states)
    for units in hidden_layers[::-1][1:-1]:
        decoder_hidden = CuDNNLSTM(units, return_sequences=True)
        decoder_layers.append(decoder_hidden)
        decoder = decoder_hidden(decoder)
    units = hidden_layers[0]
    decoder_last = CuDNNLSTM(units, return_sequences=True, return_state=True, name='decoder_last')
    decoder_outputs, _, _ = decoder_last(decoder)
decoder_dense = Dense(units=1, activation='linear')  # 1 continuous output at each timestep
decoder_outputs = decoder_dense(decoder_outputs)
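For completeness, I then assemble and compile the training model from these tensors, just as in the blog post (the optimizer and loss below are simply what I happen to use, not anything prescribed):

# Training model: teacher forcing maps [encoder_inputs, decoder_inputs] -> decoder_outputs.
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='adam', loss='mse')

Training works fine; the problem starts with inference.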
I am not sure how to proceed with creating the inference model so that my input sequences can have different prediction lengths. Initially, I tried feeding the sequence only into the first decoder layer, as in the blog post's example; however, that did not work, because the first layer does not return its states, only the last one does. I then tried feeding the input into the last decoder layer, but that failed because it expects a 3D input rather than a 2D one. I then found the post Multilayer Seq2Seq model with LSTM in Keras, which made me realize that I actually need to feed the sequence through all the decoder layers manually (hence I append them to a list; I don't know whether that is the correct/best way to do it), as follows:
# from our previous model - mapping encoder sequence to state vectors
encoder_model = Model(encoder_inputs, encoder_states)
# A modified version of the decoding stage that takes in predicted target inputs
# and encoded state vectors, returning predicted target outputs and decoder state vectors.
# We need to hang onto these state vectors to run the next step of the inference loop.
decoder_state_input_h = Input(shape=(units,))
decoder_state_input_c = Input(shape=(units,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
#decoder_outputs = decoder_first(decoder_inputs, initial_state=decoder_states_inputs)
decoder_outputs = decoder_first(decoder_inputs, initial_state=encoder_states)
decoder_outputs = decoder_layers[0](decoder_outputs)
decoder_outputs, state_h, state_c = decoder_last(decoder_outputs)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model([decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states)
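For reference, this is roughly how I intend to run inference once the two models are defined, adapted from the blog post's decoding loop (decode_sequence, n_steps and the zero initial target value are my own placeholders, not part of the code above):

import numpy as np

def decode_sequence(input_seq, n_steps):
    # Encode the input sequence into the initial state vectors.
    states_value = encoder_model.predict(input_seq)
    # Start the decoder with a dummy first target value.
    target_seq = np.zeros((1, 1, 1))
    decoded = []
    for _ in range(n_steps):
        output, h, c = decoder_model.predict([target_seq] + states_value)
        decoded.append(output[0, -1, 0])
        # Feed the prediction back in as the next decoder input.
        target_seq = output
        states_value = [h, c]
    return decoded

Since the loop runs for an arbitrary n_steps, this should give me variable prediction lengths.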
However, after executing the snippet above, I now get the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-18-77cda21e84ad> in <module>
19
20 decoder_outputs = decoder_dense(decoder_outputs)
---> 21 decoder_model = Model([decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states)
22
23 train_mean = train_dict['train_mean']
/lib/python3.6/site-packages/keras/legacy/interfaces.py in wrapper(*args, **kwargs)
89 warnings.warn('Update your `' + object_name + '` call to the ' +
90 'Keras 2 API: ' + signature, stacklevel=2)
---> 91 return func(*args, **kwargs)
92 wrapper._original_function = func
93 return wrapper
/lib/python3.6/site-packages/keras/engine/network.py in __init__(self, *args, **kwargs)
91 'inputs' in kwargs and 'outputs' in kwargs):
92 # Graph network
---> 93 self._init_graph_network(*args, **kwargs)
94 else:
95 # Subclassed network
/lib/python3.6/site-packages/keras/engine/network.py in _init_graph_network(self, inputs, outputs, name)
229 # Keep track of the network's nodes and layers.
230 nodes, nodes_by_depth, layers, layers_by_depth = _map_graph_network(
--> 231 self.inputs, self.outputs)
232 self._network_nodes = nodes
233 self._nodes_by_depth = nodes_by_depth
/lib/python3.6/site-packages/keras/engine/network.py in _map_graph_network(inputs, outputs)
1441 'The following previous layers '
1442 'were accessed without issue: ' +
-> 1443 str(layers_with_complete_input))
1444 for x in node.output_tensors:
1445 computable_tensors.append(x)
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("encoder:0", shape=(?, ?, 10), dtype=float32) at layer "encoder". The following previous layers were accessed without issue: []
Clearly I am doing something wrong. When I have only a single encoder and decoder layer, everything works as expected; it is when I try to use multiple decoder layers that I run into problems. I don't fully understand how to set up a multilayer decoder for inference.
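My current guess is that the graph disconnects because decoder_first is wired to encoder_states, which traces back to encoder_inputs, and encoder_inputs is not an input of decoder_model. Based on the linked post, I think the inference decoder should instead use its own state inputs, something like the sketch below, though I suspect every decoder layer (not just the first one) may need its own pair of state inputs and outputs, and I am not sure this is right:

# Sketch: give the first decoder layer fresh state inputs instead of reusing
# `encoder_states` (which is what disconnects the graph from decoder_model).
decoder_state_input_h = Input(shape=(hidden_layers[-1],))
decoder_state_input_c = Input(shape=(hidden_layers[-1],))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]

decoder = decoder_first(decoder_inputs, initial_state=decoder_states_inputs)
for layer in decoder_layers:
    decoder = layer(decoder)
decoder_outputs, state_h, state_c = decoder_last(decoder)
decoder_states = [state_h, state_c]
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model([decoder_inputs] + decoder_states_inputs,
                      [decoder_outputs] + decoder_states)

Is this the right direction, and if so, how do I handle the states of the intermediate decoder layers across decoding steps?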