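I'm having trouble understanding how backpropagation works in the encoder of a seq2seq model. The encoder has no labels of its own, so no error can be computed for it directly, yet the weights of its LSTM layer are somehow updated.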
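from keras.layers import Input, LSTM, Dense
from keras.models import Model

# batch_size, embedding_size and encoding_size are assumed to be defined earlier.
# Encoder: return_state=True also returns the final hidden and cell states.
l_enc_input = Input(batch_shape=(batch_size, None, embedding_size))
l_enc_lstm = LSTM(encoding_size, return_sequences=False, return_state=True, stateful=True, dropout=0.2)
# Decoder: its LSTM is initialised with the encoder's final states.
l_dec_input = Input(batch_shape=(batch_size, None, embedding_size))
l_dec_lstm = LSTM(encoding_size, return_sequences=False, stateful=True, dropout=0.2)
l_dec_dense = Dense(embedding_size, activation="softmax")
# t_enc_out is [output, state_h, state_c]; keep only the two states.
t_enc_out = l_enc_lstm(l_enc_input)
state = t_enc_out[1:]
t_dec_out = l_dec_dense(l_dec_lstm(l_dec_input, initial_state=state))
# A single model holds both encoder and decoder; the loss is on the decoder output only.
model_train = Model(inputs=[l_enc_input, l_dec_input], outputs=[t_dec_out])
model_train.compile(optimizer="adam", loss="categorical_crossentropy")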
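One way to probe this is with a toy version of the same wiring. The sketch below is not the Keras model above: it assumes a scalar tanh RNN instead of an LSTM, a squared-error loss instead of cross-entropy, and a finite-difference estimate instead of real backpropagation. All function names and numeric values are hypothetical. It only illustrates that when the decoder's initial state is the encoder's final state, the decoder loss is a function of the encoder weight, so that weight gets a non-zero gradient even though the encoder has no labels of its own.

```python
import math

def encoder(xs, w_enc, u_enc):
    """Toy scalar RNN encoder: returns only its final hidden state."""
    h = 0.0
    for x in xs:
        h = math.tanh(w_enc * x + u_enc * h)
    return h

def decoder_loss(h_enc, d, target, w_dec, u_dec):
    """One decoder step initialised with the encoder state, then squared error."""
    y = math.tanh(w_dec * d + u_dec * h_enc)
    return (y - target) ** 2

def loss(w_enc, xs=(0.5, -1.0, 0.25), u_enc=0.8,
         d=1.0, target=0.3, w_dec=0.6, u_dec=0.9):
    """Decoder loss written as a function of the encoder weight w_enc."""
    h_enc = encoder(xs, w_enc, u_enc)
    return decoder_loss(h_enc, d, target, w_dec, u_dec)

def numerical_grad(f, w, eps=1e-6):
    """Central finite-difference estimate of df/dw."""
    return (f(w + eps) - f(w - eps)) / (2 * eps)

# Gradient of the decoder loss with respect to the encoder weight.
grad_w_enc = numerical_grad(loss, 0.7)
print("d(loss)/d(w_enc) =", grad_w_enc)
```

If the printed gradient is non-zero, the only label involved (target, on the decoder side) is enough to produce an update signal for w_enc, because the error reaches the encoder through the state that is handed to the decoder.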