How to apply an inference model to a (Seq2Seq + attention) model

Time: 2019-12-13 07:27:43

Tags: python tensorflow machine-learning keras deep-learning

I have implemented a seq2seq + attention model in Keras, and now I am trying to build the inference model for it, but I am running into a shape error.

 #Encoder inputs 
encoder_inputs = tf.keras.layers.Input(shape=(None,))
encoder_embedding = tf.keras.layers.Embedding(vocab_size, 1024, mask_zero=True)(encoder_inputs)
encoder_outputs, state_h, state_c = tf.keras.layers.LSTM(1024, return_state=True, return_sequences=True)(encoder_embedding)
# Keep the encoder states; `encoder_outputs` is reused by the attention layer below.
encoder_states = [state_h, state_c] 

# Set up the decoder, using `encoder_states` as initial state.
decoder_inputs = tf.keras.layers.Input(shape=(None,))
# We set up our decoder to return full output sequences,
# and to return internal states as well. We don't use the 
# return states in the training model, but we will use them in inference.
decoder_embedding = tf.keras.layers.Embedding(vocab_size, 1024, mask_zero=True)(decoder_inputs)
decoder_lstm = tf.keras.layers.LSTM(1024, return_state=True, return_sequences=True)
decoder_outputs, _, _ = decoder_lstm(decoder_embedding, initial_state=encoder_states)

#Attention layer (tf.keras.layers.AdditiveAttention)
additive_attention_layer = tf.keras.layers.AdditiveAttention()
#inputs to the attention layer: [query, value] = [encoder_outputs, decoder_outputs]
inputs_to_attention = [encoder_outputs, decoder_outputs]
#applying the attention layer to its inputs
attention_layer = additive_attention_layer(inputs_to_attention)
#concatenating the attention output with the encoder outputs
output_attention_layer = tf.keras.layers.Concatenate()([encoder_outputs, attention_layer])
#passing into dense layer 
decoder_dense = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(vocab_size, activation=tf.keras.activations.softmax))
#decoder outputs
output = decoder_dense(output_attention_layer)
# Define the model that maps `encoder_input_data` & `decoder_input_data` to `decoder_output_data`
model = tf.keras.models.Model([encoder_inputs, decoder_inputs], output)
#compiling the model 
model.compile(optimizer=tf.keras.optimizers.Adam(), loss='categorical_crossentropy')
#model summary 
model.summary() 
#model fitting
model.fit([encoder_input_data, decoder_input_data], decoder_output_data, batch_size=86, epochs=20, validation_split=0.2) 
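As a sanity check on the shapes, here is a minimal NumPy sketch (with hypothetical toy sizes in place of `units=1024`, and assuming `use_scale=False` so the additive score is just `sum(tanh(q + k))`) of what the attention + `Concatenate` path above computes. It shows why `decoder_dense` ends up built for `2 * units` (i.e. 2048) features:

```python
import numpy as np

# Toy stand-ins for the tensors above (hypothetical small sizes; the real
# model uses units=1024, making the concatenated width 2048).
enc_steps, dec_steps, units = 5, 3, 4
rng = np.random.default_rng(0)
encoder_outputs = rng.normal(size=(enc_steps, units))
decoder_outputs = rng.normal(size=(dec_steps, units))

# AdditiveAttention([q, v]) treats the first input as the query, so here
# q = encoder_outputs and v (also used as the keys) = decoder_outputs.
scores = np.tanh(encoder_outputs[:, None, :] + decoder_outputs[None, :, :]).sum(-1)
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax over keys
attention_out = weights @ decoder_outputs          # (enc_steps, units)

# Concatenate([encoder_outputs, attention_out]) doubles the feature axis,
# so the downstream Dense layer is built expecting 2 * units features.
concat = np.concatenate([encoder_outputs, attention_out], axis=-1)
print(concat.shape)  # (5, 8)
```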

That is the Seq2Seq + attention model I trained. Below is the inference model I tried for it; is it correct...?

#inference model 
#The encoder model maps the encoder inputs to the two states (hidden state and cell state)
encoder_model = tf.keras.models.Model(encoder_inputs, encoder_states)
decoder_state_input_h = tf.keras.layers.Input(shape=(1024,))
decoder_state_input_c = tf.keras.layers.Input(shape=(1024,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_lstm(decoder_embedding, initial_state=decoder_states_inputs)
decoder_states = [state_h, state_c]

decoder_outputs = decoder_dense(decoder_outputs)
#The decoder model takes the decoder inputs plus the two state inputs
decoder_model = tf.keras.models.Model([decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states)

I am getting this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-117-801bd18ae51a> in <module>()
      6 decoder_states = [state_h, state_c]
      7 
----> 8 decoder_outputs = decoder_dense(decoder_outputs)
      9 #For decoder model we pass 2 inputs one is encoder outputs and second one is final_answers
     10 decoder_model = tf.keras.models.Model([decoder_inputs] + decoder_states_inputs, [decoder_outputs] + decoder_states)

1 frames
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
    222                                ' is incompatible with layer ' + layer_name +
    223                                ': expected shape=' + str(spec.shape) +
--> 224                                ', found shape=' + str(shape))
    225 
    226 

ValueError: Input 0 is incompatible with layer time_distributed_23: expected shape=[None, None, 2048], found shape=[None, None, 1024]
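The error says `decoder_dense` was built during training on the 2048-wide `Concatenate` output, while the inference graph feeds it the raw 1024-wide LSTM output. One possible shape-consistent inference decoder would take the encoder's output sequence as an extra input and repeat the attention + `Concatenate` steps before `decoder_dense`. The following is only a hedged, self-contained sketch with toy sizes: the layers are created fresh here, whereas the real fix would reuse the trained layer objects (which requires keeping the decoder `Embedding` as a layer object, e.g. a hypothetical `dec_emb_layer`, rather than only its output tensor):

```python
import tensorflow as tf

vocab_size, units = 20, 8  # toy sizes; the question uses vocab_size and 1024

# Fresh stand-ins for the trained layers (reuse the real trained objects instead).
decoder_inputs = tf.keras.layers.Input(shape=(None,))
dec_emb_layer = tf.keras.layers.Embedding(vocab_size, units, mask_zero=True)
decoder_lstm = tf.keras.layers.LSTM(units, return_state=True, return_sequences=True)
attention = tf.keras.layers.AdditiveAttention()
decoder_dense = tf.keras.layers.TimeDistributed(
    tf.keras.layers.Dense(vocab_size, activation="softmax"))

# Extra inference inputs: the encoder's full output sequence (needed to
# repeat the attention step) and the two decoder states.
encoder_outputs_input = tf.keras.layers.Input(shape=(None, units))
state_input_h = tf.keras.layers.Input(shape=(units,))
state_input_c = tf.keras.layers.Input(shape=(units,))

dec_emb = dec_emb_layer(decoder_inputs)
dec_out, state_h, state_c = decoder_lstm(
    dec_emb, initial_state=[state_input_h, state_input_c])
# Same [query, value] ordering as in the training graph.
attn_out = attention([encoder_outputs_input, dec_out])
concat = tf.keras.layers.Concatenate()([encoder_outputs_input, attn_out])
probs = decoder_dense(concat)  # now sees 2 * units features, matching training

decoder_model = tf.keras.models.Model(
    [decoder_inputs, encoder_outputs_input, state_input_h, state_input_c],
    [probs, state_h, state_c])
```

The matching encoder model must then also return `encoder_outputs`, not just the states, e.g. `tf.keras.models.Model(encoder_inputs, [encoder_outputs] + encoder_states)`.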

0 Answers:

There are no answers yet.