I cannot avoid this error when using a model trained with tf.contrib.layers.recompute_grad:
ValueError: The variables used on recompute were different than the variables originally
used. The function wrapped with @recompute_grad likley creates its own variable
scope with a default name and has been called twice in the same enclosing scope.
To fix, ensure each call to the function happens in its own unique variable
scope.
Any help would be greatly appreciated. As an example, I use recompute_grad with Conv2D as follows. I create a function:
conv_cnt = -1
def Conv2D_mem_eff( input_tensor,
filters,
kernel_size,
kernel_regularizer,
bias_regularizer,
padding,
name ):
global conv_cnt
conv_cnt += 1
with tf.variable_scope( name + '_{:d}'.format(conv_cnt),
use_resource = True ):
def _x( inner_input_tensor ):
x = Conv2D( filters = filters,
kernel_size = kernel_size,
padding = padding,
kernel_regularizer = kernel_regularizer,
bias_regularizer = bias_regularizer,
name = name + '_{:d}'.format(conv_cnt) )(inner_input_tensor)
return x
_x = tf.contrib.layers.recompute_grad( _x )
return _x( input_tensor )
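For reference, the unique-name scheme in Conv2D_mem_eff boils down to appending a global counter to the base name, so that each call should land in its own variable scope. A TF-free sketch of just that naming logic (unique_scope_name is a hypothetical helper for illustration, not part of my model):

```python
import itertools

# Module-level counter, equivalent to the global conv_cnt above
# (starts so that the first generated suffix is 0).
_call_counter = itertools.count()

def unique_scope_name(base):
    # Append a monotonically increasing index to the base name so that
    # every call produces a distinct scope name, e.g. 'conv01_0', 'conv01_1'.
    return '{}_{:d}'.format(base, next(_call_counter))
```

The intent is that two invocations of the wrapped layer can never collide on a variable-scope name, which is what the error message asks for.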
Then, to use it in the model, I call it through a Lambda layer, like this:
x = Lambda( Conv2D_mem_eff,
arguments = {'filters' : 3,
'kernel_size' : (5,5),
'kernel_regularizer' : l2,
'bias_regularizer' : l2,
'padding' : 'same',
'name' : 'conv01'},
name = 'conv01' )(x)
where l2 = regularizers.l2(0.001). The model uses less memory and trains very fast. I can load the model in a separate file and make predictions, but I cannot compute simple gradients, like this:
inp = mdl.input
outp = mdl.layers[lyr_idx].output
print('Input Layer: {}'.format(inp))
print('Output Layer: {}'.format(outp))
grad = K.gradients( outp, inp )[0]
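For clarity, K.gradients(outp, inp) builds the symbolic derivative of the chosen layer's output with respect to the model input. The same quantity can be approximated numerically with central finite differences; a TF-free sketch for a toy scalar-output function (illustrative only, numeric_grad is a hypothetical helper, not my actual code):

```python
import numpy as np

def numeric_grad(f, x, eps=1e-5):
    # Central finite differences: approximates d f(x) / d x element-wise,
    # i.e. the same quantity K.gradients computes symbolically,
    # for a function f with a scalar output.
    g = np.zeros_like(x)
    for i in range(x.size):
        xp = x.copy()
        xm = x.copy()
        xp.flat[i] += eps
        xm.flat[i] -= eps
        g.flat[i] = (f(xp) - f(xm)) / (2 * eps)
    return g
```

On f(x) = sum(x**2), for instance, this returns approximately 2*x, matching the symbolic gradient.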
where
conv_cnt = -1  # this is reset in this new file
lyr_idx = -1
mdl = load_model( args.model, custom_objects={ 'tf' : tf,
'Conv2D' : Conv2D,
'conv_cnt' : conv_cnt } )
Whenever I do this, I get the ValueError above. Can anyone help me? Am I doing something wrong with the global variable?
UPDATE
I modified the tensorflow module tf.contrib.layers.recompute_grad so that it prints out which original and recomputed variables the error is referring to. This is what I got:
ORIGINAL VARIABLES
{<tf.Variable 'fc04/fc04_100000/fc04_100000/kernel:0' shape=(25, 10) dtype=float32>, <tf.Variable 'fc04/fc04_100000/fc04_100000/bias:0' shape=(10,) dtype=float32>}
RECOMPUTE VARIABLES
{<tf.Variable 'gradients/fc04/fc04_100000/IdentityN_grad/fc04_100000/fc04_100001/bias:0' shape=(10,) dtype=float32>, <tf.Variable 'gradients/fc04/fc04_100000/IdentityN_grad/fc04_100000/fc04_100001/kernel:0' shape=(25, 10) dtype=float32>}