Why does multi-GPU training need name scopes?

Time: 2018-06-08 22:31:43

Tags: tensorflow scope gpu distributed

I want to train a network on multiple GPUs. While studying cifar10_multi_gpu_train, I found that the variables are shared within the same variable scope, but each tower is placed in a different name scope:

...
with tf.variable_scope(tf.get_variable_scope()):
      for i in xrange(FLAGS.num_gpus):
        with tf.device('/gpu:%d' % i):
          with tf.name_scope('%s_%d' % (cifar10.TOWER_NAME, i)) as scope:
            # Dequeues one batch for the GPU
            image_batch, label_batch = batch_queue.dequeue()
            # Calculate the loss for one tower of the CIFAR model. This function
            # constructs the entire CIFAR model but shares the variables across
            # all towers.
            loss = tower_loss(scope, image_batch, label_batch)

            # Reuse variables for the next tower.
            tf.get_variable_scope().reuse_variables()

            # Retain the summaries from the final tower.
            summaries = tf.get_collection(tf.GraphKeys.SUMMARIES, scope)

            # Calculate the gradients for the batch of data on this CIFAR tower.
            grads = opt.compute_gradients(loss)

            # Keep track of the gradients across all towers.
            tower_grads.append(grads)
...
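To make the difference concrete, here is a minimal standalone sketch of what I observed (TensorFlow 1.x graph mode; the scope name 'tower_%d' and the variable 'w' are placeholders I made up, not part of the CIFAR-10 code):

import tensorflow as tf

with tf.variable_scope(tf.get_variable_scope()):
    for i in range(2):
        with tf.name_scope('tower_%d' % i):
            # tf.get_variable ignores the enclosing name scope, so both
            # iterations refer to one shared variable named 'w:0'.
            w = tf.get_variable('w', shape=[1])
            print(w.name)   # 'w:0' on both iterations
            # Ordinary ops *do* pick up the name scope as a prefix, so
            # each tower's ops get distinct names in the graph.
            y = tf.multiply(w, 2.0, name='y')
            print(y.name)   # 'tower_0/y:0', then 'tower_1/y:0'
            # Mark the scope for reuse so the second iteration reuses
            # 'w' instead of raising a "variable already exists" error.
            tf.get_variable_scope().reuse_variables()

If I read it correctly, this is also what makes the tf.get_collection(tf.GraphKeys.SUMMARIES, scope) call in the snippet work: the per-tower name scope prefix lets each tower's summaries be filtered out separately, even though the variables themselves are shared.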

I understand that variable scope is used to share variables. Can someone explain why we also need a name scope? Thanks.

0 Answers:

No answers