I'm working on an encoder-decoder chatbot that consists of an embedding layer, two LSTM layers, and two fully connected layers on top of the decoder. After loading a checkpoint file, the loss is much higher than it was when I last saved the model, and the chatbot's responses are far worse than expected. However, the model has not fallen all the way back to its initial state: if I save it at a loss of 2.4, it loads back at a loss of about 4-5 rather than 10 (the loss before the model started learning at all).
The model also learns faster after the weights are loaded, which makes me believe that some of the weights are being restored successfully while others are not.
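To narrow this down, I figured I could list which variables the checkpoint file actually contains and compare them with the graph. A minimal diagnostic sketch, assuming TensorFlow 1.x and that checkpoint_path is the same path the Saver writes to (not part of my model code):

    import tensorflow as tf

    checkpoint_path = 'model.ckpt'  # hypothetical path; in my case it is self.checkpoint_path
    # names and shapes stored in the checkpoint file
    for name, shape in tf.train.list_variables(checkpoint_path):
        print(name, shape)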
I build the model and load the checkpoint in the __init__ function with the following code:
self.__gather_data()
self.__build_model()
tf.global_variables_initializer().run(session=self.sess)
self.saver = tf.train.Saver(tf.global_variables())
try:
    self.saver.restore(self.sess, self.checkpoint_path)
except:
    print('Starting from scratch.')
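One thing I'm not sure about is whether the bare except is hiding a partial restore failure. A small variant I could use to surface the actual error (just a sketch, with the rest of __init__ unchanged):

    try:
        self.saver.restore(self.sess, self.checkpoint_path)
    except Exception as exc:
        # print the real restore error instead of silently starting over
        print('Restore failed:', exc)
        print('Starting from scratch.')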
This is how I build the model in the __build_model function:
# placeholders
with tf.variable_scope(self.scope + '-placeholders'):
    self.inputs = tf.placeholder(tf.int32, [None, self.input_length], name='inputs')
    self.outputs = tf.placeholder(tf.int32, [None, None], name='outputs')
    self.targets = tf.placeholder(tf.int32, [None, None], name='targets')

# embedding
with tf.variable_scope(self.scope + 'embedding'):
    self.input_embedding = tf.Variable(tf.ones((self.vocab_size, self.embed_size)))
    self.output_embedding = tf.Variable(tf.ones((self.vocab_size, self.embed_size)))
    input_embed = tf.nn.embedding_lookup(self.input_embedding, self.inputs)
    output_embed = tf.nn.embedding_lookup(self.output_embedding, self.outputs)

# encoder
with tf.variable_scope(self.scope + '-encoder'):
    lstm_enc_1 = tf.contrib.rnn.LSTMCell(self.hidden_size, reuse=tf.AUTO_REUSE)
    lstm_enc_2 = tf.contrib.rnn.LSTMCell(self.hidden_size, reuse=tf.AUTO_REUSE)
    _, last_state = tf.nn.dynamic_rnn(tf.contrib.rnn.MultiRNNCell(cells=[lstm_enc_1, lstm_enc_2]), inputs=input_embed, dtype=tf.float32)

# decoder
with tf.variable_scope(self.scope + '-decoder'):
    lstm_dec_1 = tf.contrib.rnn.LSTMCell(self.hidden_size, reuse=tf.AUTO_REUSE)
    lstm_dec_2 = tf.contrib.rnn.LSTMCell(self.hidden_size, reuse=tf.AUTO_REUSE)
    dec_outputs, _ = tf.nn.dynamic_rnn(tf.contrib.rnn.MultiRNNCell(cells=[lstm_dec_1, lstm_dec_2]), inputs=output_embed, initial_state=last_state, dtype=tf.float32)
    self.logits = tf.contrib.layers.fully_connected(dec_outputs, num_outputs=self.vocab_size, activation_fn=None, reuse=tf.AUTO_REUSE, scope='fully_connected')

# loss and optimizer
with tf.variable_scope(self.scope + '-optimizing'):
    self.loss = tf.contrib.seq2seq.sequence_loss(self.logits, self.targets, tf.ones([self.batch_size, self.input_length]))
    self.optimizer = tf.train.RMSPropOptimizer(0.001).minimize(self.loss)
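In case it helps, the full variable names these scopes produce can be compared against the names in the checkpoint. Right after __build_model I could dump them like this (diagnostic sketch only):

    # full names and shapes of the variables the graph actually creates
    for var in tf.global_variables():
        print(var.name, var.get_shape().as_list())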
I'm saving the weights during training with this call:
self.saver.save(self.sess, self.checkpoint_path)
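For completeness, Saver.save returns the path it actually wrote, so a small variation I might use to confirm that the saved file and the restored file really match (sketch):

    save_path = self.saver.save(self.sess, self.checkpoint_path)
    print('Checkpoint written to', save_path)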