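Recently, following this instruction, I tried to convert the transformer generative model on TensorFlow official website into a TensorFlow.js model. The tf.keras model was trained and tested in a Python environment and works well there. However, the converted model does not work properly. Below is the code used to build the tf.keras model.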
# Imports assumed by this snippet (tf.keras; not shown in the original excerpt)
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Concatenate, Input, Lambda, RepeatVector
from tensorflow.keras.models import Model

# Inputs:
decoder_input = Input(shape=(None, 5), name='decoder_input')
batch_z = Input(shape=(self.hps['z_size'],))

def repeat_vector(args):
    # Repeat the latent vector once per time step of the variable-length sequence
    layer_to_repeat = args[0]
    sequence_layer = args[1]
    return RepeatVector(K.shape(sequence_layer)[1])(layer_to_repeat)

tile_z = Lambda(repeat_vector, output_shape=(None, self.hps['z_size']))([batch_z, decoder_input])
decoder_full_input = Concatenate()([decoder_input, tile_z])
# Apply decoder on new inputs
decoder_output, attention = self.transformer_decoder(decoder_full_input, True)
# Apply original output layer
model_output = self.output(decoder_output)
decoder_model = Model(inputs=[decoder_input, batch_z], outputs=decoder_output)
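To spell out what the repeat_vector and Concatenate steps are meant to compute, here is a plain-Python sketch that uses no TensorFlow. The helper name tile_and_concat and the toy sizes (3 time steps, latent size 2) are made up for illustration and are not part of the model; the point is only that the latent vector is appended to every time step, whatever the sequence length is.

```python
from typing import List


def tile_and_concat(decoder_input: List[List[float]], z: List[float]) -> List[List[float]]:
    """Append the latent vector z to every time step of a single sequence.

    Hypothetical helper: mirrors RepeatVector + Concatenate for one example,
    without assuming a fixed number of time steps.
    """
    return [step + z for step in decoder_input]


# One sequence with 3 time steps of 5 features each, and a latent vector of size 2.
sequence = [[0.0] * 5 for _ in range(3)]
z = [1.0, 2.0]

full_input = tile_and_concat(sequence, z)
print(len(full_input), len(full_input[0]))  # 3 7
```

The same call works unchanged for a sequence of any other length, which is the behaviour the shape (None, 5) is supposed to preserve.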
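After some checking, I found that replacing
decoder_input = Input(shape=(None, 5), name='decoder_input')
with
decoder_input = Input(shape=(some integer number, 5), name='decoder_input')
makes the converted model run correctly. However, the generative model is meant to work on variable-length sequences, so a fixed input shape such as shape=(some integer, 5) does not meet the requirement. How can I solve this problem?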