I am working on an RNN that is supposed to generate handwriting, and I have been trying to train the model for weeks now. The model does work, but the results are only mediocre because training always stops very early.
Training always aborts with the following exception:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Found Inf or NaN global norm.
My partner's first guess and mine was that we have an exploding-gradient problem, so this is what we have tried so far:
Only one of these approaches prevents the error from occurring, but the downside is that the model's loss then stays high and it still performs poorly.
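Independently of the clipping we also want to rule out bad values in the data itself, since a single NaN/Inf sample is enough to blow up the global norm. A rough sanity check we could run per batch (just a sketch; the batch names are placeholders for whatever our data loader returns):

import numpy as np

def batch_is_finite(x_batch, y_batch):
    """Return True only if every value in the batch is finite (no NaN/Inf)."""
    return np.isfinite(x_batch).all() and np.isfinite(y_batch).all()

# skip or log any batch that would poison the gradients:
# if not batch_is_finite(x, y): continue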
The error occurs here:
grads = tf.gradients(self.cost, tvars) # tf.gradients(ys, xs) calculates the gradients of ys w.r.t. xs
grads, _ = tf.clip_by_global_norm(grads, params.grad_clip) # clip gradients to a maximum global norm of params.grad_clip
self.train_op = self.optimizer.apply_gradients(zip(grads, tvars)) # training operation (learning)
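To find out which tensor goes bad first, one thing that can be tried is wrapping each gradient in tf.check_numerics before clipping, so the exception names the offending variable instead of only reporting the global norm. A sketch using the same names as above (TF1 graph mode, assuming all gradients are dense tensors):

grads = tf.gradients(self.cost, tvars)
# per-variable Inf/NaN check instead of one opaque global-norm check
grads = [tf.check_numerics(g, 'NaN/Inf in gradient of ' + v.name) if g is not None else g
         for g, v in zip(grads, tvars)]
grads, _ = tf.clip_by_global_norm(grads, params.grad_clip)
self.train_op = self.optimizer.apply_gradients(zip(grads, tvars))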
These are our parameters:
self.batch_size=32 if self.train else 1 #only for training
self.tsteps=200 if self.train else 1 #only for backprop in LSTM cell (time steps to complete the sentence)
self.data_scale = 100 #amount to scale data down before training
self.limit = 500
self.tsteps_per_ascii=25 #estimation for one char at gaussian conv./char window
self.alphabet=" abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
self.data_dir="./data/"
self.len_threshold=1
#model
self.rnn_size = 100
self.dropout = 0.85 #probability of keeping a neuron during dropout; helps the architecture learn more general features
self.kmixtures = 1 #number of gaussian mixtures for character window
self.nmixtures = 8 #number of gaussian mixtures
self.learning_rate = 0.00001 #learning rate
self.grad_clip = 10. # clip gradients to this magnitude (avoid exploding gradients)
self.optimizer = 'rms' # 'adam' or 'rms'
self.lr_decay = 1.0
self.decay = 0.95
self.momentum = 0.9
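For completeness, this is roughly how the parameters above map onto the optimizer object used in apply_gradients (a sketch only, not our exact training code; TF1 API):

lr = tf.Variable(params.learning_rate, trainable=False) # a variable so it can be decayed between epochs
if params.optimizer == 'rms':
    optimizer = tf.train.RMSPropOptimizer(lr, decay=params.decay, momentum=params.momentum)
elif params.optimizer == 'adam':
    optimizer = tf.train.AdamOptimizer(lr)
# learning-rate schedule, applied once per epoch:
# sess.run(tf.assign(lr, params.learning_rate * (params.lr_decay ** epoch)))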