Question

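Following TensorFlow's MNIST For ML Beginners tutorial, we learn the most basic SGD, with a learning rate of 0.5, a batch size of 100 and 1000 steps, like this: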

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
...
for i in range(1000):
  batch_xs, batch_ys = mnist.train.next_batch(100)
  sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

In CNTK, the intuitive equivalent would be:

SGD = {
    minibatchSize = 100
    maxEpochs = 1000
    learningRatesPerMB = 0.5
}

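This looks like it is doing far more computation; at the very least, it is certainly more verbose.

The concepts of minibatch and epoch in CNTK differ from what I have seen elsewhere, as does the way it treats the learning rate.

What is the direct (or closest possible) equivalent of the basic SGD in TensorFlow shown above? How does each concept translate between the two frameworks?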

Answer 1

It looks like TensorFlow and CNTK define a minibatch in the same way:

'Minibatch size' in CNTK means the number of samples processed between model updates

An epoch in CNTK is similar to a step in TensorFlow, i.e. how many times the session runs the train op.

maxEpochs: maximum number of epochs to run.
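Whether an epoch really maps onto a single step depends on how many samples one epoch covers. The arithmetic below is a rough sketch of the two readings. The training-set size of 55,000 and the idea that an epoch may mean one full pass over the data are assumptions, not something the quoted text confirms, and the helper names are made up for illustration.

```python
def samples_seen(steps, batch_size):
    """Total number of samples processed by `steps` minibatch updates."""
    return steps * batch_size


def passes_over_data(total_samples, dataset_size):
    """How many full passes over the dataset `total_samples` amounts to."""
    return total_samples / dataset_size


# Assumption: the tutorial's training split holds 55,000 images.
TRAIN_SET_SIZE = 55_000

# TensorFlow loop from the question: 1000 steps of 100 samples each.
tf_total = samples_seen(steps=1000, batch_size=100)
tf_passes = passes_over_data(tf_total, TRAIN_SET_SIZE)

# Assumption: if one epoch were a full pass, 1000 epochs would process this many samples.
full_pass_total = 1000 * TRAIN_SET_SIZE

print(tf_total)                    # samples seen by the TensorFlow loop
print(round(tf_passes, 2))         # equivalent number of full passes
print(full_pass_total / tf_total)  # how much more work the full-pass reading implies
```

Under the full-pass reading, the config in the question would process several hundred times more samples than the TensorFlow loop, which would fit the impression that it is doing far more computation.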

learningRatesPerMB is a little different:

this will be converted into learningRatesPerSample by dividing the values by the specified 'minibatchSize'

learningRatesPerSample is similar to TensorFlow's learning rate.
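The conversion quoted above can be checked with a small NumPy sketch. It assumes that the per-sample rate is applied to the gradient summed over the minibatch, and it uses random stand-in gradients rather than a real model; `per_sample_rate` is a hypothetical helper, not a CNTK or TensorFlow function. Which of the two rates TensorFlow's value lines up with likely depends on whether the loss is summed or averaged over the batch.

```python
import numpy as np


def per_sample_rate(rate_per_minibatch, minibatch_size):
    """Divide a per-minibatch learning rate by the minibatch size."""
    return rate_per_minibatch / minibatch_size


minibatch_size = 100
rate_per_mb = 0.5
rate_per_sample = per_sample_rate(rate_per_mb, minibatch_size)

# Stand-in gradients: one row per sample, ten parameters each.
rng = np.random.default_rng(0)
per_sample_grads = rng.normal(size=(minibatch_size, 10))

# Update when the loss is averaged over the batch and scaled by the per-minibatch rate.
update_from_mean = rate_per_mb * per_sample_grads.mean(axis=0)

# Update when the gradients are summed over the batch and scaled by the per-sample rate.
update_from_sum = rate_per_sample * per_sample_grads.sum(axis=0)

print(rate_per_sample)
print(np.allclose(update_from_mean, update_from_sum))
```

With a batch-averaged loss, a rate of 0.5 produces the same update as a per-sample rate of 0.5 / 100 applied to summed gradients, so the two conventions differ only by the factor of the minibatch size.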

CNTK documentation on SGD: https://developer.github.com/v3/oauth_authorizations/#create-a-new-authorization
