This is the error I ran into in model.fit while trying to build a sequence-to-sequence (encoder-decoder) model with Bahdanau attention. I am facing an argument error; can someone tell me what exactly the problem is?
#Bahdanau attention
#parameters to pass to this attention layer:
'''
1. Encoder states, i.e. state_h and state_c
2. encoder_outputs
3. decoder_embedding, which is in the decoder part
4. you will get a context vector named "input_to_decoder"; pass this as input to the decoder LSTM layer
'''
def Attention_layer(state_h, state_c, encoder_outputs, decoder_embedding):
    d0 = tf.keras.layers.Dense(1024, name='dense_layer_1')
    d1 = tf.keras.layers.Dense(1024, name='dense_layer_2')
    d2 = tf.keras.layers.Dense(1024, name='dense_layer_3')
    #hidden_with_time_axis_1 = tf.keras.backend.expand_dims(state_h, 1)
    #hidden_with_time_axis_1 = state_h
    #hidden_with_time_axis_2 = tf.keras.backend.expand_dims(state_c, 1)
    #hidden_with_time_axis_2 = state_c
    #hidden_states = tf.keras.layers.concatenate([state_h, state_c], axis=-1)
    #all_states = tf.keras.layers.concatenate()
    score = d0(tf.keras.activations.tanh(encoder_outputs) + d1(state_h) + d2(state_c))
    attention_weights = tf.keras.activations.softmax(score, axis=1)
    context_vector = attention_weights * encoder_outputs
    context_vector = tf.keras.backend.sum(context_vector, axis=1)
    context_vector = tf.keras.backend.expand_dims(context_vector, 1)
    context_vector = tf.keras.backend.reshape(context_vector, [-1, -1, 1024])
    input_to_decoder = tf.keras.layers.concatenate([context_vector, decoder_embedding], axis=-1)
    return input_to_decoder
The above is my attention layer.
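As a side note on the error message that appears in the traceback further down: the failing node is a `Reshape`, and both TensorFlow and NumPy enforce the same rule there, namely that at most one dimension of a reshape target may be -1 (the inferred dimension). A minimal NumPy sketch of that rule (illustration only, not part of the original code):

```python
import numpy as np

x = np.arange(12)

# One -1 is allowed: the missing dimension is inferred.
print(x.reshape(-1, 4).shape)  # (3, 4)

# Two -1 entries are ambiguous and raise an error, analogous to
# TensorFlow's "Only one input size may be -1".
try:
    x.reshape(-1, -1, 4)
except ValueError as e:
    print(e)
```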
#Encoder inputs
encoder_inputs = tf.keras.layers.Input(shape=(None,),name='encoder_input_layer')
encoder_embedding = tf.keras.layers.Embedding(vocab_size, 1024, mask_zero=True,name='encoder_embedding_layer')(encoder_inputs)
encoder_outputs , state_h , state_c = tf.keras.layers.LSTM(1024, return_state=True)(encoder_embedding)
# We discard `encoder_outputs` and only keep the states.
encoder_states = [state_h, state_c]
# Set up the decoder, using `encoder_states` as initial state.
decoder_inputs = tf.keras.layers.Input(shape=(None,),name='decoder_input_layer')
# We set up our decoder to return full output sequences,
# and to return internal states as well. We don't use the
# return states in the training model, but we will use them in inference.
decoder_embedding = tf.keras.layers.Embedding(vocab_size, 1024, mask_zero=True,name='decoder_embedding_layer')(decoder_inputs)
decoder_lstm = tf.keras.layers.LSTM(1024, return_state=True, return_sequences=True)
#Attention layer which is defined in the function above
attention_layer = Attention_layer(state_h, state_c, encoder_outputs, decoder_embedding)
decoder_outputs, _, _ = decoder_lstm(attention_layer, initial_state=encoder_states)
decoder_dense = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(vocab_size, activation='softmax'))
output = decoder_dense(decoder_outputs)
# Define the model that will turn `encoder_input_data` & `decoder_input_data` into `decoder_target_data`
model = tf.keras.models.Model([encoder_inputs, decoder_inputs], output)
#compiling the model
model.compile(optimizer='adam', loss='categorical_crossentropy')
#model summary
model.summary()
When I try to fit the model, I get stuck on the following error, and I don't understand what it means:
%%time
model.fit([encoder_input_data, decoder_input_data], decoder_output_data, batch_size=86, epochs=10, validation_split=0.2)
------------------------------------------------------------------------------------------------------------------------------
#Output :
Train on 4644 samples, validate on 1162 samples
Epoch 1/10
86/4644 [..............................] - ETA: 8:18
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
<ipython-input-34-781d7ca43c98> in <module>()
----> 1 get_ipython().run_cell_magic('time', '', 'model.fit([encoder_input_data, decoder_input_data], decoder_output_data, batch_size=86, epochs=10, validation_split=0.2) ')
14 frames
</usr/local/lib/python3.6/dist-packages/decorator.py:decorator-gen-60> in time(self, line, cell, local_ns)
<timed eval> in <module>()
/usr/local/lib/python3.6/dist-packages/six.py in raise_from(value, from_value)
InvalidArgumentError: Only one input size may be -1, not both 0 and 1
[[node model/tf_op_layer_Reshape/Reshape (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1751) ]] [Op:__inference_distributed_function_16029]
Function call stack:
distributed_function
Can anyone help me find where I went wrong?
Answer 0 (score: 0)
These lines are probably your problem:
encoder_inputs = tf.keras.layers.Input(shape=(None,),name='encoder_input_layer')
decoder_inputs = tf.keras.layers.Input(shape=(None,),name='decoder_input_layer')
You can't use shape=(None,); you must at least specify the number of features in the input.
To elaborate on the error you got: the batch dimension is handled automatically and is expected to be -1 (or equivalently None) for dimension 0, since you can always choose to change the batch size. But dimension 1 cannot also be None (which is what you are currently setting), because that is the only non-batch feature dimension: your model does not know the size of the feature input.
This answer provides more information on the valid shapes for different kinds of model inputs.
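Following the answer's suggestion, here is a minimal sketch of what specifying the sequence length on the encoder input looks like (the vocabulary size, sequence length, and layer widths below are hypothetical placeholders, not values from the question):

```python
import tensorflow as tf

vocab_size = 100  # hypothetical vocabulary size
max_len = 20      # hypothetical fixed sequence length

# Specify the sequence length explicitly instead of shape=(None,),
# so only the batch dimension remains unknown.
encoder_inputs = tf.keras.layers.Input(shape=(max_len,), name='encoder_input_layer')
emb = tf.keras.layers.Embedding(vocab_size, 8)(encoder_inputs)
out, state_h, state_c = tf.keras.layers.LSTM(8, return_state=True)(emb)
model = tf.keras.models.Model(encoder_inputs, out)

# model.input_shape is (None, 20): batch dimension free, sequence length fixed.
print(model.input_shape)
```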