TensorFlow: error when executing the mean_squared_error loss function

Date: 2019-01-01 21:40:47

Tags: tensorflow machine-learning deep-learning

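I am doing transfer learning with a pre-trained Inception-ResNet-v2 model. From one of the conv layers I extract the best activation (the one with the highest quality), using OpenCV and NumPy operations, in order to compute the predicted landmarks. The loss function I apply is the mean_squared_error loss. Unfortunately, when I execute this function I get an error message saying that no gradients are provided for any variable. I have been struggling with this problem for two weeks and do not know how to proceed. While debugging, I can see that the problem occurs when the apply_gradients function is executed internally. I have searched here and tried solutions such as: ValueError: No gradients provided for any variable in Tensorflow; selecting trainable variables to compute gradient "No variables to optimize"; Tensorflow: How to replace or modify gradient?; ...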

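In addition, I tried to write my own op with gradient support by following this great tutorial, https://code-examples.net/en/q/253d718, because that solution wraps Python and OpenCV code in TensorFlow. Unfortunately, the problem persists. Tracing the path from the network output to the mean_squared_error function in TensorBoard, I can see that the path is available and continuous.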

import tensorflow as tf  # TensorFlow 1.x API (tf.py_func, tf.losses, tf.train)

# Extracts the best predicted images from a specific activation layer
# PYTHON function: get_best_images(...) -> uses numpy and opencv
# PYTHON function: extract_landmarks(...) -> uses numpy

# end_points['Conv2d_1a_3x3'] is the conv layer that gets extracted
best_predicted = tf.py_func(
    get_best_images,
    [input, end_points['Conv2d_1a_3x3']],
    tf.uint8)  # Gets best activation
best_predicted.set_shape(input.shape)

# Gets the predicted landmarks and processes both target and predicted
# landmarks for further calculation
proc_landmarks = tf.py_func(
    get_landmarks,
    [best_predicted, target_landmarks],
    [tf.int32, tf.int32])
proc_landmarks[0].set_shape(target_landmarks.shape)  # target landmarks
proc_landmarks[1].set_shape(target_landmarks.shape)  # predicted landmarks

# --> HERE COMES THE COMPUTATION TO PROCESS THE TARGET AND PREDICTED LANDMARKS

# Flattens the tensors and reshapes them to shape (68, 1)
# (target_result / predicted_result presumably come from the omitted computation above)
target_flatten = tf.reshape(target_result[0], [-1])
target_flatten = tf.reshape(target_flatten, [68, 1])
predicted_flatten = tf.reshape(predicted_result[1], [-1])
predicted_flatten = tf.reshape(predicted_flatten, [68, 1])
edit_target_landmarks = tf.cast(target_flatten, dtype=tf.float32)
edit_predicted_landmarks = tf.cast(predicted_flatten, dtype=tf.float32)

# Calculating the loss
mse_loss = tf.losses.mean_squared_error(
    labels=edit_target_landmarks,
    predictions=edit_predicted_landmarks)

optimizer = tf.train.AdamOptimizer(
    learning_rate=0.001,
    name='ADAM_OPT').minimize(mse_loss)  # <-- the error occurs here
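One plausible reading of the code above, offered as an assumption rather than a confirmed diagnosis: tf.py_func has no gradient unless one is registered for it, and the landmarks it returns are tf.int32, which is not differentiable even after a cast to float32. The underlying reason can be shown without TensorFlow. The sketch below uses only the Python standard library; the function names and activation values are hypothetical. A hard, integer-valued peak location does not change under small perturbations of the activations, so its derivative is zero everywhere it is defined. A soft-argmax (a different technique, not the method used in the question) stays smooth and yields non-zero derivatives.

```python
import math


def hard_argmax(activations):
    """Index of the largest activation: integer-valued and piecewise constant."""
    return max(range(len(activations)), key=lambda i: activations[i])


def soft_argmax(activations, temperature=1.0):
    """Softmax-weighted average index: a smooth stand-in for argmax."""
    scaled = [a / temperature for a in activations]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return sum(i * w for i, w in enumerate(weights)) / total


def finite_difference(fn, activations, index, eps=1e-4):
    """Central finite-difference estimate of d fn / d activations[index]."""
    up = list(activations)
    down = list(activations)
    up[index] += eps
    down[index] -= eps
    return (fn(up) - fn(down)) / (2 * eps)


# Hypothetical 1-D activation profile with a clear peak at index 2.
activations = [0.1, 0.7, 2.0, 0.3]
hard_grad = [finite_difference(hard_argmax, activations, i) for i in range(4)]
soft_grad = [finite_difference(soft_argmax, activations, i) for i in range(4)]

print(hard_grad)  # every entry is 0.0: no gradient signal reaches the activations
print(soft_grad)  # non-zero entries: the smooth version carries a gradient
```

If the landmark extraction can be expressed with differentiable TensorFlow ops in this spirit, the optimizer has a path back to the network weights. Whether that is feasible depends on what get_best_images and get_landmarks actually do, which the question does not show.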

The error message is the following (abbreviated; only some of the variables are listed):

  

ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients, between variables ["<tf.Variable 'InceptionResnetV2/Conv2d_1a_3x3/weights:0' shape=(3, 3, 3, 32) dtype=float32_ref>", "<tf.Variable 'InceptionResnetV2/Conv2d_1a_3x3/BatchNorm/beta:0' shape=(32,) dtype=float32_ref>", "<tf.Variable 'InceptionResnetV2/Conv2d_2a_3x3/weights:0' shape=(3, 3, 32, 32) dtype=float32_ref>", "<tf.Variable 'InceptionResnetV2/Conv2d_2a_3x3/BatchNorm/beta:0' shape=(32,) dtype=float32_ref>", "<tf.Variable 'InceptionResnetV2/Conv2d_2b_3x3/weights:0' shape=(3, 3, 32, 64) dtype=float32_ref>", "<tf.Variable 'InceptionResnetV2/Conv2d_2b_3x3/BatchNorm/beta:0' shape=(64,) dtype=float32_ref>", "<tf.Variable 'InceptionResnetV2/Conv2d_3b_1x1/weights:0' shape=(1, 1, 64, 80) dtype=float32_ref>", "<tf.Variable 'InceptionResnetV2/Conv2d_3b_1x1/BatchNorm/beta:0' shape=(80,) dtype=float32_ref>", "<tf.Variable 'InceptionResnetV2/Conv2d_4a_3x3/weights:0' shape=(3, 3, 80, 192) dtype=float32_ref>", "<tf.Variable 'InceptionResnetV2/Conv2d_4a_3x3/BatchNorm/beta:0' shape=(192,) dtype=float32_ref>", "<tf.Variable 'InceptionResnetV2/Mixed_5b/Branch_0/Conv2d_1x1/weights:0' shape=(1, 1, 192, 96) dtype=float32_ref>", ...

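Edit: Following the guide "Override Tensorflow Backward-Propagation", I now get gradients for the first two variables in the list of trainable variables. Based on that guide, I had forgotten the third parameter (called the d parameter in the guide) in the forward and backward propagation functions; in my case it is the conv layer output of the network. However, only the first two gradients are computed, and all the others are still missing. Do I need to compute and return a gradient for every trainable variable in the backward-propagation function? As I understand it, in the backward function we compute the derivatives with respect to the op's inputs, which in my case are two variables (the target and the predicted landmarks) and the conv layer output (i.e. return grad * op.inputs[0], grad * op.inputs[1], grad * op.inputs[2]). I assume that the overall computation for all trainable variables happens after the custom gradient has been defined, when opt.compute_gradients is called with the variable list as its second argument. Am I right or wrong?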
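The general rule behind this question can be illustrated with a small sketch. It uses plain Python with scalar values instead of TensorFlow, and every name in it is hypothetical. A custom backward function returns one value per op input: the incoming gradient multiplied by the local partial derivative of the op's output with respect to that input. The framework then chains these values back to the trainable variables, so the backward function does not return a gradient per trainable variable. Note that grad * op.inputs[i] is only correct when the local derivative happens to equal the input itself.

```python
def wrapped_forward(conv_out, scale):
    """Hypothetical stand-in for the wrapped Python code: any function of the op inputs."""
    return scale * conv_out ** 2


def wrapped_backward(grad, conv_out, scale):
    """One gradient per op input: incoming grad times the local partial derivative."""
    d_conv_out = grad * 2.0 * scale * conv_out  # d(output) / d(conv_out)
    d_scale = grad * conv_out ** 2              # d(output) / d(scale)
    return d_conv_out, d_scale


def loss_fn(w, x, scale, target):
    """Toy network: w is the only trainable variable, the loss is a squared error."""
    conv_out = w * x
    pred = wrapped_forward(conv_out, scale)
    return (pred - target) ** 2


def grad_wrt_w(w, x, scale, target):
    """Chain rule from the loss back to w, using the custom backward function."""
    conv_out = w * x
    pred = wrapped_forward(conv_out, scale)
    d_pred = 2.0 * (pred - target)  # d(loss) / d(pred)
    d_conv_out, _ = wrapped_backward(d_pred, conv_out, scale)
    return d_conv_out * x           # d(conv_out) / d(w) = x, the step a framework does itself


w, x, scale, target = 0.5, 3.0, 2.0, 1.0
analytic = grad_wrt_w(w, x, scale, target)

# Cross-check against a central finite difference of the loss.
eps = 1e-5
numeric = (loss_fn(w + eps, x, scale, target) - loss_fn(w - eps, x, scale, target)) / (2 * eps)

print(analytic, numeric)
```

In this toy setting the analytic and numeric values agree, which suggests the backward function only needs to be correct for the op's own inputs. If the first two inputs of the real op are integer landmark tensors, gradients returned for them cannot flow any further back, so only the conv layer input would connect the loss to the network weights.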

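I have posted the part of the TensorBoard output that leads up to the mean_squared_error op. The graph also shows an additional loss function that I left out above to simplify the question; that loss function works fine. Because of the problem, the arrow from the mean_squared_error function to the gradient computation is missing. I hope this gives a better overview. (Image: TensorBoard graph around the mean_squared_error op)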

0 answers:

No answers yet