I am using the VGG16 architecture with an adaptive learning rate. My code runs fine with the adagrad and adadelta optimizers, but a problem appears when I switch to the adam optimizer: after 30 or 40 epochs the code fails with assert not np.isnan(loss_value), 'Model diverged with loss = NaN'. I have tried lowering the learning rate (even though that should not be necessary), reducing the batch size, increasing the epsilon value, and so on, but this only increases the number of epochs before the failure and does not solve the problem. I checked a previous StackOverflow post and tried changing the loss function by adding a small fraction inside the log, but that did not help either. I am using this {{ 3}} structure.
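For reference, the optimizer construction looks roughly like this; the learning rate and epsilon values below are only placeholders for the ones I experimented with:

import tensorflow as tf

# Adam is the optimizer that diverges; lowering learning_rate or raising
# epsilon only delays the NaN (the values here are examples, not the exact ones).
opt = tf.train.AdamOptimizer(learning_rate=1e-4, epsilon=1e-4)

# The same model trains without NaNs when I use one of these instead:
# opt = tf.train.AdagradOptimizer(1e-4)
# opt = tf.train.AdadeltaOptimizer()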
Loss function
def loss(logits, labels):
    # Per-sample hazard ratios and their cumulative sum
    hazard_ratio = tf.exp(logits)
    cumsum = tf.cumsum(hazard_ratio)
    # Log of the cumulative sum gathered at the label indices, shifted by the max logit
    likelihood = tf.log(tf.gather(cumsum, tf.reshape(labels, [-1]))) + tf.reduce_max(logits)
    diff = tf.subtract(logits, likelihood)
    num = tf.reshape(diff, [-1]) * tf.cast(labels, tf.float32)
    # Negative sum over the batch
    reduce = -(tf.reduce_sum(num))
    # reduce = -(tf.reduce_mean(num))
    tf.add_to_collection('losses', reduce)
    return reduce, tf.add_n(tf.get_collection('losses'))
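One of the changes I tried after reading the StackOverflow post was adding a small constant inside tf.log so that its argument can never be exactly zero; the variant looked roughly like this (the 1e-8 is just an example of the values I tried), but the NaN still appears:

    # Attempted variant: keep the argument of tf.log strictly positive
    # (the 1e-8 constant is only an example value)
    likelihood = tf.log(tf.gather(cumsum, tf.reshape(labels, [-1])) + 1e-8) + tf.reduce_max(logits)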
Add model loss summaries
def _add_loss_summaries(total_loss):
    """Add summaries for losses in CIFAR-10 model.

    Generates moving average for all losses and associated summaries for
    visualizing the performance of the network.

    Args:
        total_loss: Total loss from loss().
    Returns:
        loss_averages_op: op for generating moving averages of losses.
    """
    # Compute the moving average of all individual losses and the total loss.
    loss_averages = tf.train.ExponentialMovingAverage(0.9, name='avg')
    losses = tf.get_collection('losses')
    loss_averages_op = loss_averages.apply(losses + [total_loss])

    # Attach a scalar summary to all individual losses and the total loss; do the
    # same for the averaged version of the losses.
    for l in losses + [total_loss]:
        # Name each loss as '(raw)' and name the moving average version of the loss
        # as the original loss name.
        tf.summary.scalar(l.op.name + ' (raw)', l)
        tf.summary.scalar(l.op.name, loss_averages.average(l))

    return loss_averages_op
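This helper is wired into the training step in the usual CIFAR-10 tutorial style, roughly like this (the optimizer line is the only part I change between runs):

def train(total_loss, global_step):
    # Generate moving averages of all losses and associated summaries.
    loss_averages_op = _add_loss_summaries(total_loss)

    # Compute gradients only after the loss averages have been updated.
    with tf.control_dependencies([loss_averages_op]):
        opt = tf.train.AdamOptimizer(learning_rate=1e-4, epsilon=1e-4)
        grads = opt.compute_gradients(total_loss)

    # Apply the gradients and advance the global step.
    apply_gradient_op = opt.apply_gradients(grads, global_step=global_step)
    return apply_gradient_op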
Traceback (most recent call last):
File "train.py", line 726, in <module>
tf.app.run()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "train.py", line 722, in main
train()
File "train.py", line 626, in train
assert not np.isnan(loss_value), 'Model diverged with loss = NaN'
AssertionError: Model diverged with loss = NaN
Please help.