Question

In the MNIST example, the optimizer is set up as follows:

# Optimizer: set up a variable that's incremented once per batch and
# controls the learning rate decay.
batch = tf.Variable(0, dtype=data_type())
# Decay once per epoch, using an exponential schedule starting at 0.01.
learning_rate = tf.train.exponential_decay(
    0.01,                # Base learning rate.
    batch * BATCH_SIZE,  # Current index into the dataset.
    train_size,          # Decay step.
    0.95,                # Decay rate.
    staircase=True)
# Use simple momentum for the optimization.
optimizer = tf.train.MomentumOptimizer(learning_rate,
                                       0.9).minimize(loss,
                                                     global_step=batch)

And in the training process:

for step in xrange(int(num_epochs * train_size) // BATCH_SIZE):

    # skip some code here
    sess.run(optimizer, feed_dict=feed_dict)

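My question is: when defining learning_rate, they use batch * BATCH_SIZE as the global step. In the training loop, however, we only have the variable step. How does the code connect (or pass) the step information to the global step parameter of tf.train.exponential_decay? I am not clear on how this Python parameter-passing mechanism works.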

Answer 1

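From the code you linked, batch is the global step. Its value is updated by the optimizer, because batch is passed as global_step to minimize. The learning-rate node takes it as input. The Python loop variable step is never passed into the graph.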
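To make the mechanism concrete, here is a minimal pure-Python sketch of what happens on each sess.run(optimizer). It is not TensorFlow code. The Counter class, the train_step function, and the values BATCH_SIZE = 64 and train_size = 640 are assumptions made up for illustration. The decay formula follows the documented behaviour of tf.train.exponential_decay with staircase=True: base rate times decay rate raised to floor(global_step / decay_steps).

```python
import math

BATCH_SIZE = 64    # assumed value, for illustration only
train_size = 640   # assumed value: 10 batches per epoch


def exponential_decay(base_lr, global_step, decay_steps, decay_rate,
                      staircase=True):
    """Plain-Python version of the documented decay formula."""
    exponent = global_step / decay_steps
    if staircase:
        # Staircase mode decays in discrete jumps, once per decay_steps.
        exponent = math.floor(exponent)
    return base_lr * decay_rate ** exponent


class Counter:
    """Hypothetical stand-in for the tf.Variable named `batch`."""

    def __init__(self):
        self.value = 0


def train_step(counter):
    """Hypothetical stand-in for sess.run(optimizer)."""
    # The learning rate is computed from the counter's current value.
    lr = exponential_decay(0.01, counter.value * BATCH_SIZE, train_size, 0.95)
    # ... the parameter update using `lr` would happen here ...
    # The optimizer then increments the counter it was given as global_step.
    counter.value += 1
    return lr


batch = Counter()
# The loop variable `step` only counts iterations; it is never handed to
# train_step. The counter advances on its own, once per call.
rates = [train_step(batch) for step in range(25)]
```

With these assumed sizes, the rate stays at 0.01 for the first 10 calls (one epoch), drops to 0.01 * 0.95 for the next 10, and so on, even though step is never used inside train_step.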

The naming may be the issue. batch simply means the number of the current batch (of size BATCH_SIZE) used for training. A better name might have been step or even global_step.

Most of the global_step code seems to be in a single source file. It is quite short, and perhaps a good way to see how the pieces work together.

About setting global step information in mini-batch optimization
