TensorFlow aggregation_method for optimizers

Time: 2017-05-16 11:55:39

Tags: tensorflow deep-learning

I can't find any documentation on the aggregation method used by TensorFlow's optimizers.

I have the following line of code:

train_op = optimizer.minimize(loss, global_step=batch, aggregation_method=tf.AggregationMethod.EXPERIMENTAL_TREE)

However, this argument can be changed to

tf.AggregationMethod.EXPERIMENTAL_ACCUMULATE_N

Does anyone know what it does? (All I know is that with the default setting, my LSTM does not have enough memory to run.)
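For reference, here is a minimal self-contained sketch of how I pass that argument; the toy model below is just a stand-in for my actual network (an LSTM), and the variable names are placeholders:

    import tensorflow as tf

    # Toy stand-in for the real model.
    x = tf.placeholder(tf.float32, [None, 10])
    y = tf.placeholder(tf.float32, [None, 1])
    w = tf.Variable(tf.zeros([10, 1]))
    loss = tf.reduce_mean(tf.square(tf.matmul(x, w) - y))

    batch = tf.Variable(0, trainable=False)  # global step counter
    optimizer = tf.train.AdamOptimizer(1e-3)

    # The strategy for summing gradient contributions is selected here.
    train_op = optimizer.minimize(
        loss,
        global_step=batch,
        aggregation_method=tf.AggregationMethod.EXPERIMENTAL_ACCUMULATE_N)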

1 Answer:

Answer 0 (score: 3)

For AggregationMethod, EXPERIMENTAL_ACCUMULATE_N uses accumulate_n rather than add_n (which ADD_N and DEFAULT use). add_n waits for all of its arguments to be available before doing any summation, while accumulate_n sums as soon as its inputs are available. This can save memory, but it has some picky shape-information restrictions, because its current implementation requires creating a temporary variable.
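The difference is visible in the low-level ops these methods map to. A minimal sketch using the TF 1.x API (tensor names and sizes here are just for illustration):

    import tensorflow as tf

    a = tf.random_normal([1000, 1000])
    b = tf.random_normal([1000, 1000])
    c = tf.random_normal([1000, 1000])

    # add_n keeps all three inputs alive until it can sum them in one step.
    total_add_n = tf.add_n([a, b, c])

    # accumulate_n folds each input into a temporary accumulator as it becomes
    # available, which can lower peak memory -- but every input must share the
    # same, fully known static shape.
    total_acc_n = tf.accumulate_n([a, b, c], shape=[1000, 1000])

    with tf.Session() as sess:
        sess.run([total_add_n, total_acc_n])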

There is some documentation in the comments:

      # The benefit of using AccumulateN is that its inputs can be combined
      # in any order and this can allow the expression to be evaluated with
      # a smaller memory footprint.  When used with gpu_allocator_retry,
      # it is possible to compute a sum of terms which are much larger than
      # total GPU memory.
      # AccumulateN can currently only be used if we know the shape for
      # an accumulator variable.  If this is not known, or if we only have
      # 2 grads then we fall through to the "tree" case below.
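Note that minimize forwards aggregation_method through compute_gradients to the gradients machinery, so (assuming a loss tensor as in the sketch above) the same option can also be passed to tf.gradients directly:

    grads = tf.gradients(
        loss, tf.trainable_variables(),
        aggregation_method=tf.AggregationMethod.EXPERIMENTAL_ACCUMULATE_N)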