Steps and batch size in gradient descent in TensorFlow

Posted: 2017-06-23 19:20:31

Tags: tensorflow deep-learning gradient-descent

I'm working through the Udacity Deep Learning course, and one of its assignments says: "Demonstrate an extreme case of overfitting. Restrict your training data to just a few batches."

My questions are:

1) Why does shrinking num_steps and num_batches relate to overfitting? We aren't adding any variables, and we aren't increasing the size of W.

In the code below, num_steps used to be 3001 (with batch_size 128), and the solution simply reduces num_steps to 101 and limits the data to 3 batches:

    num_steps = 101
    num_batches = 3

    with tf.Session(graph=graph) as session:
      tf.initialize_all_variables().run()
      print("Initialized")
      for step in range(num_steps):
        # Pick an offset within the training data, which has been randomized.
        # Note: we could use better randomization across epochs.
        #offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
        offset = step % num_batches
        # Generate a minibatch.
        batch_data = train_dataset[offset:(offset + batch_size), :]
        batch_labels = train_labels[offset:(offset + batch_size), :]
        # Prepare a dictionary telling the session where to feed the minibatch.
        # The key of the dictionary is the placeholder node of the graph to be fed,
        # and the value is the numpy array to feed to it.
        feed_dict = {tf_train_dataset : batch_data, tf_train_labels : batch_labels, beta_regul : 1e-3}
        _, l, predictions = session.run(
          [optimizer, loss, train_prediction], feed_dict=feed_dict)
        if (step % 2 == 0):
          print("Minibatch loss at step %d: %f" % (step, l))
          print("Minibatch accuracy: %.1f%%" % accuracy(predictions, batch_labels))
          print("Validation accuracy: %.1f%%" % accuracy(
            valid_prediction.eval(), valid_labels))
      print("Test accuracy: %.1f%%" % accuracy(test_prediction.eval(), test_labels))

This code is taken from the solution: https://github.com/rndbrtrnd/udacity-deep-learning/blob/master/3_regularization.ipynb

2) Can someone explain the concept of "offset" in gradient descent? Why do we have to use it?

3) I've experimented with num_steps and found that accuracy improves if I increase it. Why? How should I interpret num_steps in relation to the learning rate?

1 Answer:

Answer 0 (score: 1)

1) When training a neural network, it is very typical to set an early-stopping condition to prevent overfitting. You aren't adding new variables, but with early stopping you don't use the existing ones to their full capacity, which has more or less the same effect.
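To make the early-stopping idea concrete, here is a minimal sketch (not part of the course code; `train_one_step` and `eval_validation_loss` are hypothetical placeholders for a real training step and validation check):

```python
def train_with_early_stopping(train_one_step, eval_validation_loss,
                              max_steps=3001, patience=5):
    """Stop training once validation loss hasn't improved for `patience` steps."""
    best_loss = float("inf")
    bad_checks = 0
    for step in range(max_steps):
        train_one_step()
        val_loss = eval_validation_loss()
        if val_loss < best_loss:
            best_loss = val_loss
            bad_checks = 0
        else:
            bad_checks += 1
            if bad_checks >= patience:
                # Stop early: the weights never fully specialize to the train set.
                return step
    return max_steps - 1
```

Capping num_steps at a small value acts like a crude, fixed version of this stopping condition.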

2) In this case the "offset" determines which observations end up in the minibatch: it is the remainder of dividing the step number by the number of batches, so the same few slices of the training data are reused over and over.
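A tiny demo (plain Python lists standing in for the numpy arrays in the question) of how `offset = step % num_batches` cycles through the same few slices, using made-up toy sizes:

```python
num_batches = 3
batch_size = 2
train_data = list(range(20))  # toy "training set" of 20 examples

seen = set()
for step in range(101):
    offset = step % num_batches          # cycles 0, 1, 2, 0, 1, 2, ...
    batch = train_data[offset:offset + batch_size]
    seen.update(batch)

print(sorted(seen))  # -> [0, 1, 2, 3]: the rest of the data is never touched
```

Note that with this offset scheme the three "batches" overlap: only the first `num_batches + batch_size - 1` examples are ever used, which is exactly what restricts the training data and forces overfitting.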

3) Think of the "learning rate" as speed and "num_steps" as time. If you run for longer, you may drive farther... but if you drive faster, you might crash and get no further at all...
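The analogy can be checked numerically with gradient descent on the toy function f(x) = x², whose gradient is 2x (a made-up example, not from the assignment):

```python
def gradient_descent(lr, num_steps, x0=10.0):
    """Minimize f(x) = x**2 starting from x0; the minimum is at x = 0."""
    x = x0
    for _ in range(num_steps):
        x = x - lr * 2 * x  # gradient of x**2 is 2*x
    return x

print(abs(gradient_descent(lr=0.1, num_steps=10)))   # still some distance from 0
print(abs(gradient_descent(lr=0.1, num_steps=100)))  # more steps: much closer
print(abs(gradient_descent(lr=1.5, num_steps=100)))  # too fast: diverges (the "crash")
```

With a moderate learning rate, more steps bring you closer to the minimum, which is why increasing num_steps improved accuracy; with too large a rate, each step overshoots and the iterates blow up.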