TensorFlow training loss vs. validation loss

Asked: 2018-10-03 22:43:19

Tags: python tensorflow machine-learning

I am learning to model an LSTM in TensorFlow by following examples; the one I am working from was written for TF 1.0 about two years ago. I define the cost function as:

cost = tf.reduce_mean(tf.losses.absolute_difference(predictions=predictions, labels=target))
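
For context, the rest of the graph is wired up roughly like this (a simplified sketch, not my exact code: the LSTM layers are omitted, and max_steps, n_features, learning_rate and the choice of Adam are stand-ins):

import tensorflow as tf  # TF 1.x API, as in the example I am following

# Placeholder shapes are illustrative only
input = tf.placeholder(tf.float32, [None, max_steps, n_features], name='input')
target = tf.placeholder(tf.float32, [None, max_steps, 1], name='target')
lens = tf.placeholder(tf.int32, [None], name='lens')

# `predictions` is the output of the (omitted) LSTM / projection layer built on `input` and `lens`
cost = tf.reduce_mean(tf.losses.absolute_difference(predictions=predictions, labels=target))
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)
grads = tf.gradients(cost, tf.trainable_variables())
cost_summary = tf.summary.scalar('train_cost', cost)
cost_val_summary = tf.summary.scalar('validation_cost', cost)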

I then use it in the training loop:

for _batch in range(b_per_epoch):
    batch_xs, batch_ys, leng = get_batch(_batch, batch_size_i, x_, y_)
    # Run one training step on the batch
    res = sess.run([optimizer, cost, grads, cost_summary],
                   feed_dict={input: batch_xs,
                              target: batch_ys,
                              lens: leng})
    # Running average of the training cost over the batches seen so far
    cum_cost += res[1]
    train_cost = cum_cost / (_batch + 1)

This works fine. Then, at the end of each epoch, I try to validate on data kept separate from the training data:

# Run the validation sample through in batches
for _batch_t in range(b_per_epoch_t):
    test_xs_t, test_ys_t, leng_t = get_batch(_batch_t, batch_size_i, xt_, yt_)
    # Evaluate the loss only (no optimizer op, so the weights are not updated)
    resu = sess.run([cost, cost_val_summary],
                    feed_dict={input: test_xs_t,
                               target: test_ys_t,
                               lens: leng_t})
    # Running average of the validation cost
    cum_cost_t += resu[0]
    test_cost = cum_cost_t / (_batch_t + 1)

There are no errors, but the results look inconsistent. train_cost starts at 0.65 at the end of epoch 0 and drops to 0.1 by epoch 30, so the network does seem to learn. Meanwhile test_cost starts at 0.4 at the end of epoch 0 and stays in the 0.38-0.43 range throughout these first epochs. I believe this is not overfitting but a coding error. Am I doing something wrong?
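
To make the structure clearer, here is a condensed sketch of how the two loops are nested per epoch, with both accumulators reset at the start of every epoch (n_epochs is a stand-in; get_batch and the placeholders are as above):

for epoch in range(n_epochs):
    cum_cost, cum_cost_t = 0.0, 0.0

    # Training pass: the optimizer op is run, so the weights are updated
    for _batch in range(b_per_epoch):
        batch_xs, batch_ys, leng = get_batch(_batch, batch_size_i, x_, y_)
        res = sess.run([optimizer, cost, grads, cost_summary],
                       feed_dict={input: batch_xs, target: batch_ys, lens: leng})
        cum_cost += res[1]
    train_cost = cum_cost / b_per_epoch

    # Validation pass: only the cost is evaluated, no weight updates
    for _batch_t in range(b_per_epoch_t):
        test_xs_t, test_ys_t, leng_t = get_batch(_batch_t, batch_size_i, xt_, yt_)
        resu = sess.run([cost, cost_val_summary],
                        feed_dict={input: test_xs_t, target: test_ys_t, lens: leng_t})
        cum_cost_t += resu[0]
    test_cost = cum_cost_t / b_per_epoch_t

    print('epoch %d: train_cost %.3f, test_cost %.3f' % (epoch, train_cost, test_cost))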

0 Answers