I am working with this Keras model. It comes from the NMT example and is meant to be a chatbot. My specific question is about the Bidirectional wrapper around LSTM a. The Bidirectional wrapper returns five items: one is the returned output sequence, and the other four are h states and c states, one pair for the forward direction and one for the backward direction. I want to feed these into the next LSTM layer, but I don't seem to be doing it effectively. First of all, I don't know which is h and which is c, nor which two are forward and which two are backward. Second, the way I pass the values to the next LSTM layer is not good. Can anyone help me? I am using the TensorFlow 1.6.0 backend and Keras 2.1.5 on a Linux machine.
from keras.models import Model
from keras.layers import Input, Embedding, LSTM, Bidirectional, Dense, Average

# `units` and `tokens_per_sentence` are set elsewhere in my script (hparams).

def embedding_model_lstm(words,
                         embedding_weights_a=None,
                         trainable=False,
                         skip_embed=False,
                         return_sequences_b=False):

    lstm_unit_a = units
    lstm_unit_b = units  # * 2
    embed_unit = 100  # int(hparams['embed_size'])

    x_shape = (tokens_per_sentence,)

    valid_word_a = Input(shape=x_shape)
    valid_word_b = Input(shape=x_shape)

    # shared embedding layer for encoder and decoder inputs
    embeddings_a = Embedding(words, embed_unit,
                             weights=[embedding_weights_a],
                             input_length=tokens_per_sentence,
                             trainable=trainable)

    embed_a = embeddings_a(valid_word_a)

    ### encoder for training ###
    lstm_a = Bidirectional(LSTM(units=lstm_unit_a,
                                return_sequences=True,
                                return_state=True,
                                # recurrent_dropout=0.2,
                                input_shape=(None,)),
                           merge_mode='ave')

    # With return_state=True the Bidirectional wrapper hands back five tensors:
    # the merged output sequence plus four states. I do not know which of
    # rec_a_1 .. rec_a_4 are h/c or forward/backward -- this is my question.
    recurrent_a, rec_a_1, rec_a_2, rec_a_3, rec_a_4 = lstm_a(embed_a)

    # my current attempt: average the states pairwise and use the result as [h, c]
    concat_a_1 = Average()([rec_a_1, rec_a_3])
    concat_a_2 = Average()([rec_a_2, rec_a_4])

    lstm_a_states = [concat_a_1, concat_a_2]

    embed_b = embeddings_a(valid_word_b)

    ### decoder for training ###
    lstm_b = LSTM(units=lstm_unit_b,
                  # recurrent_dropout=0.2,
                  return_sequences=return_sequences_b,
                  return_state=True)

    recurrent_b, inner_lstmb_h, inner_lstmb_c = lstm_b(embed_b,
                                                       initial_state=lstm_a_states)

    dense_b = Dense(embed_unit, input_shape=(tokens_per_sentence,),
                    activation='relu')  # softmax or relu

    decoder_b = dense_b(recurrent_b)

    model = Model([valid_word_a, valid_word_b], decoder_b)

    ### encoder for inference ###
    model_encoder = Model(valid_word_a, lstm_a_states)

    ### decoder for inference ###
    input_h = Input(shape=(None,))
    input_c = Input(shape=(None,))

    inputs_inference = [input_h, input_c]

    embed_b = embeddings_a(valid_word_b)

    outputs_inference, outputs_inference_h, outputs_inference_c = lstm_b(
        embed_b, initial_state=inputs_inference)

    outputs_states = [outputs_inference_h, outputs_inference_c]

    dense_outputs_inference = dense_b(outputs_inference)

    ### inference model ###
    model_inference = Model([valid_word_b] + inputs_inference,
                            [dense_outputs_inference] + outputs_states)

    return model, model_encoder, model_inference
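To make the question concrete, here is a tiny standalone probe of just the layer I am confused about (the sizes and the probe_* names are made up for this snippet, not part of the model above). It only shows that the wrapper returns five tensors and what their shapes are; it does not assume which of the four states is h or c, forward or backward:

from keras.layers import Input, LSTM, Bidirectional
from keras import backend as K

probe_in = Input(shape=(7, 100))               # (timesteps, embed_unit), made-up sizes
probe_bi = Bidirectional(LSTM(units=16,
                              return_sequences=True,
                              return_state=True),
                         merge_mode='ave')
probe_out = probe_bi(probe_in)                 # a list of 5 tensors
print(len(probe_out))                          # -> 5
for t in probe_out:
    print(K.int_shape(t))                      # sequence: (None, 7, 16); each state: (None, 16)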
I am using Python 3.