Keras LSTM model in a Bidirectional wrapper - passing values to the next layer

Posted: 2018-03-15 00:59:58

Tags: python-3.x tensorflow keras

I am working on the Keras model below. It is adapted from the NMT example and is meant to be a chatbot. My specific question is about the Bidirectional wrapper around LSTM a. The wrapper returns five items: one is the output sequence, and the other four are h states and c states, one pair for the forward direction and one for the backward direction. I want to feed these into the next LSTM layer, and I don't seem to be doing it effectively. First, I don't know which items are h and which are c, nor which two are forward and which two are backward. Second, the method I am using to pass the values to the next LSTM layer doesn't feel right (see the sketch after the model code). Can anyone help? I am using the TensorFlow 1.6.0 backend with Keras 2.1.5 on a Linux machine.

    from keras.layers import (Input, Embedding, LSTM, Bidirectional,
                              Average, Dense)
    from keras.models import Model

    def embedding_model_lstm(words,
                             embedding_weights_a=None,
                             trainable=False,
                             skip_embed=False,
                             return_sequences_b=False):

        # `units` and `tokens_per_sentence` are module-level settings
        lstm_unit_a = units
        lstm_unit_b = units  # * 2
        embed_unit = 100  # int(hparams['embed_size'])

        x_shape = (tokens_per_sentence,)


        valid_word_a = Input(shape=x_shape)
        valid_word_b = Input(shape=x_shape)

        embeddings_a = Embedding(words, embed_unit,
                                 weights=[embedding_weights_a],
                                 input_length=tokens_per_sentence,
                                 trainable=trainable
                                 )

        embed_a = embeddings_a(valid_word_a)

        ### encoder for training ###
        # merge_mode='ave' only merges the two output sequences;
        # the four states come back unmerged, one pair per direction
        lstm_a = Bidirectional(LSTM(units=lstm_unit_a,
                                    return_sequences=True,
                                    return_state=True,
                                    #recurrent_dropout=0.2,
                                    ), merge_mode='ave')

        # five return values: the merged output plus four states --
        # but which are h/c, and which are forward/backward?
        recurrent_a, rec_a_1, rec_a_2, rec_a_3, rec_a_4 = lstm_a(embed_a)

        # average what I hope are the two h states and the two c states
        concat_a_1 = Average()([rec_a_1, rec_a_3])
        concat_a_2 = Average()([rec_a_2, rec_a_4])

        lstm_a_states = [concat_a_1, concat_a_2]

        embed_b = embeddings_a(valid_word_b)

        lstm_b = LSTM(units=lstm_unit_b,
                      #recurrent_dropout=0.2,
                      return_sequences=return_sequences_b,
                      return_state=True
                      )

        recurrent_b, inner_lstmb_h, inner_lstmb_c = lstm_b(embed_b, initial_state=lstm_a_states)

        dense_b = Dense(embed_unit,
                        activation='relu',  # softmax or relu
                        )

        decoder_b = dense_b(recurrent_b)

        model = Model([valid_word_a, valid_word_b], decoder_b)

        ### encoder for inference ###
        model_encoder = Model(valid_word_a, lstm_a_states)

        ### decoder for inference ###
        # state inputs for inference; their size must match lstm_unit_b
        input_h = Input(shape=(lstm_unit_b,))
        input_c = Input(shape=(lstm_unit_b,))

        inputs_inference = [input_h, input_c]

        # reuse the shared embedding and the decoder LSTM defined above
        outputs_inference, outputs_inference_h, outputs_inference_c = lstm_b(
            embed_b, initial_state=inputs_inference)

        outputs_states = [outputs_inference_h, outputs_inference_c]

        dense_outputs_inference = dense_b(outputs_inference)

        ### inference model ###
        model_inference = Model([valid_word_b] + inputs_inference,
                                [dense_outputs_inference] +
                                outputs_states)

        return model, model_encoder, model_inference
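
From reading the Keras source I suspect the five return values come back as the merged output first, then the forward h and c, then the backward h and c, which would mean my Average() calls are in fact pairing h with h and c with c -- but I am not sure, which is why I am asking. Here is a minimal sketch of that assumption (the toy shapes are made up just for this probe):

    import numpy as np
    from keras.layers import Input, LSTM, Bidirectional, Average
    from keras.models import Model

    steps, feats, n_units = 7, 3, 4   # toy sizes, only for this probe

    inp = Input(shape=(steps, feats))
    # assumed order: merged output, forward h, forward c, backward h, backward c
    out, fw_h, fw_c, bw_h, bw_c = Bidirectional(
        LSTM(n_units, return_sequences=True, return_state=True),
        merge_mode='ave')(inp)

    h = Average()([fw_h, bw_h])   # would be the merged hidden state
    c = Average()([fw_c, bw_c])   # would be the merged cell state

    probe = Model(inp, [out, h, c])
    for t in probe.predict(np.zeros((1, steps, feats))):
        print(t.shape)            # (1, 7, 4), (1, 4), (1, 4)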

I am using Python 3.
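
For completeness, this is roughly how I intend to wire the three returned models together -- just a sketch with zero-filled placeholder ids; `words` and `embedding_weights_a` come from my data pipeline, and I know the fixed-length decoder input is a hack:

    import numpy as np

    model, model_encoder, model_inference = embedding_model_lstm(
        words, embedding_weights_a=embedding_weights_a)

    # training would use the combined model, e.g.
    # model.fit([encoder_ids, decoder_ids], targets, ...)

    # inference: encode a (1, tokens_per_sentence) array of token ids once
    prompt_ids = np.zeros((1, tokens_per_sentence), dtype='int32')
    h, c = model_encoder.predict(prompt_ids)

    # one decoder step: current token ids plus states in,
    # scores plus updated states out
    step_ids = np.zeros((1, tokens_per_sentence), dtype='int32')
    scores, h, c = model_inference.predict([step_ids, h, c])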

0 Answers:

No answers yet.