I defined an encoder-decoder model in TensorFlow using the Keras subclassing API. The model only implements the training side, but that is enough to start seeing many of the problems I am asking about here.
I opened an issue on the TensorFlow GitHub repo, https://github.com/tensorflow/tensorflow/issues/30533, where I explained some of the errors I ran into with TensorFlow 2.0 on this model. Since then I have reached a new state of the model: I fixed some of the errors and I think I can push further (though I am not 100% sure I am past them), and I would appreciate help improving the model if the underlying problems can be addressed.
Here is my complete model. I reduced the dataset to a single sample to make the code easy to reproduce, but the behavior, the errors, and the training speed stay the same:
import tensorflow as tf
import tensorflow.python.keras as kr

class TextGenerator(tf.keras.Model):
    """Encoder-decoder text generator (training side only)."""

    def __init__(self, vocabulary_size=1, embedding_size=1):
        super(TextGenerator, self).__init__()
        self.vocabulary_size = vocabulary_size
        self.embedding_size = embedding_size
        self.encoder_mask = tf.keras.layers.Masking(0)
        self.encoder_embedding = tf.keras.layers.Embedding(vocabulary_size, embedding_size, trainable=False)
        self.encoder_lstm = tf.keras.layers.LSTM(embedding_size, activation="tanh", return_sequences=True, return_state=True)
        self.decoder_mask = tf.keras.layers.Masking(0)
        self.decoder_embedding = tf.keras.layers.Embedding(vocabulary_size, embedding_size, trainable=False)
        self.decoder_lstm = tf.keras.layers.LSTM(embedding_size, activation="tanh", return_sequences=True)
        self.decoder_hidden = tf.keras.layers.Dense(embedding_size, activation="sigmoid")
        self.decoder_output = tf.keras.layers.Dense(1, activation="relu")

    def call(self, inputs):
        # Encode the question; keep the final LSTM states to seed the decoder.
        encoder_out = self.encoder_mask(inputs["questions"])
        encoder_out = self.encoder_embedding(encoder_out)
        encoder_out, state_h, state_c = self.encoder_lstm(encoder_out)
        encoder_state = [state_h, state_c]
        # Decode the (teacher-forced) answer from the encoder states.
        decoder_out = self.decoder_mask(inputs["answers"])
        decoder_out = self.decoder_embedding(decoder_out)
        decoder_out = self.decoder_lstm(decoder_out, initial_state=encoder_state)
        decoder_out = self.decoder_hidden(decoder_out)
        decoder_out = self.decoder_output(decoder_out)
        # Flatten (batch, time, 1) to (batch, time) to match the targets.
        decoder_out = tf.reshape(decoder_out, [tf.shape(decoder_out)[0], -1])
        return decoder_out

padded_questions = [[9, 6, 313, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]
padded_answers = [[1, 155, 447, 6, 7, 623, 11, 1006, 14, 183, 1007, 10, 1008, 624, 19, 62, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]
padded_targets = [[155, 447, 6, 7, 623, 11, 1006, 14, 183, 1007, 10, 1008, 624, 19, 62, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]

dataset = tf.data.Dataset.from_tensor_slices(
    ({"questions": padded_questions, "answers": padded_answers}, padded_targets)
).shuffle(100).batch(64)

model = TextGenerator(2000, 2)
model.compile(kr.optimizers.Adam(0.5), kr.losses.MeanSquaredError())
model.fit(dataset, epochs=100, verbose=2)

for sample, target in dataset.take(1):
    model(sample)
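For comparison, here is a minimal encoder sketch (my assumption, not part of the model above — the `MaskedEncoder` class and the toy sizes are made up for illustration) of the more idiomatic way to mask padding in a subclassed model: letting `Embedding` generate the mask itself via `mask_zero=True`, instead of applying a separate `Masking` layer to the raw integer ids.

```python
import tensorflow as tf

class MaskedEncoder(tf.keras.Model):
    """Hypothetical encoder: Embedding emits the padding mask itself."""

    def __init__(self, vocab_size=2000, embed_size=2):
        super().__init__()
        # mask_zero=True creates a mask where id == 0; the LSTM consumes it.
        self.embedding = tf.keras.layers.Embedding(vocab_size, embed_size, mask_zero=True)
        self.lstm = tf.keras.layers.LSTM(embed_size, return_state=True)

    def call(self, ids):
        x = self.embedding(ids)       # mask propagates automatically
        out, h, c = self.lstm(x)      # padded steps are skipped
        return out, h, c

ids = tf.constant([[9, 6, 313, 0, 0]])  # trailing zeros are padding
out, h, c = MaskedEncoder()(ids)
```

With this layout, the mask is attached to the embedded tensor and every downstream layer that supports masking picks it up without an explicit `Masking` layer.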
The model never converges, and I don't know whether that is caused by this signature error:
2019-07-19 10:30:38.769833: W tensorflow/core/grappler/optimizers/implementation_selector.cc:199] Skipping optimization due to error while loading function libraries: Invalid argument: Functions '__inference___backward_cudnn_lstm_431_607_specialized_for_training_gradients_text_generator_lstm_StatefulPartitionedCall_grad_StatefulPartitionedCall_at___inference_keras_scratch_graph_3717' and '__inference___backward_cudnn_lstm_431' both implement 'lstm_2997a8f9-a216-4721-b0fd-26c01299835e' but their signatures do not match.
Is this a bug in the model itself?
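As a point of reference for the convergence question, a sketch of how text generation is usually trained (this is an assumption about the intent, not the code above): each timestep is treated as a classification over the vocabulary with a `Dense(vocab_size, activation="softmax")` head and sparse categorical cross-entropy on integer targets, rather than regressing the token ids with MSE through a ReLU output. The model below is a minimal functional-API illustration with made-up sizes.

```python
import tensorflow as tf

vocab_size, embed_size = 2000, 8

# Decoder-style head: per-timestep probability distribution over the vocabulary.
inputs = tf.keras.Input(shape=(None,), dtype="int32")
x = tf.keras.layers.Embedding(vocab_size, embed_size, mask_zero=True)(inputs)
x = tf.keras.layers.LSTM(embed_size, return_sequences=True)(x)
probs = tf.keras.layers.Dense(vocab_size, activation="softmax")(x)
model = tf.keras.Model(inputs, probs)

# Sparse CE compares the distributions against integer token targets directly,
# so no one-hot encoding of padded_targets is needed.
model.compile(tf.keras.optimizers.Adam(1e-3),
              tf.keras.losses.SparseCategoricalCrossentropy())
```

Under this framing the output shape is `(batch, time, vocab_size)` instead of `(batch, time)`, and the loss rewards putting probability mass on the correct token id rather than numerically approximating it.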