I cannot avoid this error when using a model trained with tf.contrib.layers.recompute_grad:
ValueError: The variables used on recompute were different than the variables originally
used. The function wrapped with @recompute_grad likley creates its own variable
scope with a default name and has been called twice in the same enclosing scope.
To fix, ensure each call to the function happens in its own unique variable
scope.
Any help would be greatly appreciated. As an example, I use recompute_grad with Conv2D as follows. I create a function:
conv_cnt = -1
def Conv2D_mem_eff( input_tensor,
filters,
kernel_size,
kernel_regularizer,
bias_regularizer,
padding,
name ):
global conv_cnt
conv_cnt += 1
with tf.variable_scope( name + '_{:d}'.format(conv_cnt),
use_resource = True ):
def _x( inner_input_tensor ):
x = Conv2D( filters = filters,
kernel_size = kernel_size,
padding = padding,
kernel_regularizer = kernel_regularizer,
bias_regularizer = bias_regularizer,
name = name + '_{:d}'.format(conv_cnt) )(inner_input_tensor)
return x
_x = tf.contrib.layers.recompute_grad( _x )
return _x( input_tensor )
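For reference, the unique-name scheme in Conv2D_mem_eff boils down to appending a global counter to the base name, so that each call should land in its own variable scope. A TF-free sketch of just that naming logic (unique_scope_name is a hypothetical helper for illustration, not part of my model):

```python
import itertools

# Module-level counter, equivalent to the global conv_cnt above
# (starts so that the first generated suffix is 0).
_call_counter = itertools.count()

def unique_scope_name(base):
    # Append a monotonically increasing index to the base name so that
    # every call produces a distinct scope name, e.g. 'conv01_0', 'conv01_1'.
    return '{}_{:d}'.format(base, next(_call_counter))
```

The intent is that two invocations of the wrapped layer can never collide on a variable-scope name, which is what the error message asks for.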
Then, to use it in the model, I call it through a Lambda layer, like this:
x = Lambda( Conv2D_mem_eff,
arguments = {'filters' : 3,
'kernel_size' : (5,5),
'kernel_regularizer' : l2,
'bias_regularizer' : l2,
'padding' : 'same',
'name' : 'conv01'},
name = 'conv01' )(x)
where l2 = regularizers.l2(0.001). The model uses less memory and trains very fast. I can load the model in a separate file and make predictions, but I cannot compute simple gradients, like this:
inp = mdl.input
outp = mdl.layers[lyr_idx].output
print('Input Layer: {}'.format(inp))
print('Output Layer: {}'.format(outp))
grad = K.gradients( outp, inp )[0]
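For clarity, K.gradients(outp, inp) builds the symbolic derivative of the chosen layer's output with respect to the model input. The same quantity can be approximated numerically with central finite differences; a TF-free sketch for a toy scalar-output function (illustrative only, numeric_grad is a hypothetical helper, not my actual code):

```python
import numpy as np

def numeric_grad(f, x, eps=1e-5):
    # Central finite differences: approximates d f(x) / d x element-wise,
    # i.e. the same quantity K.gradients computes symbolically,
    # for a function f with a scalar output.
    g = np.zeros_like(x)
    for i in range(x.size):
        xp = x.copy()
        xm = x.copy()
        xp.flat[i] += eps
        xm.flat[i] -= eps
        g.flat[i] = (f(xp) - f(xm)) / (2 * eps)
    return g
```

On f(x) = sum(x**2), for instance, this returns approximately 2*x, matching the symbolic gradient.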
where
conv_cnt = -1  # this is reset in this new file
lyr_idx = -1
mdl = load_model( args.model, custom_objects={ 'tf' : tf,
'Conv2D' : Conv2D,
'conv_cnt' : conv_cnt } )
Whenever I do this, I get the ValueError above. Can anyone help me? Am I doing something wrong with the global variable?
UPDATE
I modified the tensorflow module tf.contrib.layers.recompute_grad so that it prints out which original and recomputed variables the error is referring to. This is what I got:
ORIGINAL VARIABLES
{<tf.Variable 'fc04/fc04_100000/fc04_100000/kernel:0' shape=(25, 10) dtype=float32>, <tf.Variable 'fc04/fc04_100000/fc04_100000/bias:0' shape=(10,) dtype=float32>}
RECOMPUTE VARIABLES
{<tf.Variable 'gradients/fc04/fc04_100000/IdentityN_grad/fc04_100000/fc04_100001/bias:0' shape=(10,) dtype=float32>, <tf.Variable 'gradients/fc04/fc04_100000/IdentityN_grad/fc04_100000/fc04_100001/kernel:0' shape=(25, 10) dtype=float32>}