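I have been trying to implement the YOLO cost function, shown below. This is my first attempt at writing my own cost function in TensorFlow, and I'm not sure I'm going about it correctly. First, my model uses many intermediate steps, and I'm not sure whether that complicates the computation graph in some meaningfully destructive way. I'm also using absolute-value steps, and I'm not sure whether they negatively affect backpropagation. Any feedback on whether I'm approaching this problem correctly would be appreciated.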
I can answer any questions about the implementation.
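Note: Z13 is the prediction and y is the ground truth. My model has 49 cells (7x7), and each cell is represented by a 7x1 vector: [probability that anything is in the cell, x midpoint, y midpoint, box width, box height, prob dog, prob cat]. The reference paper (the YOLO paper, linked below) explains the cost function in depth.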
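To make the per-cell layout concrete, here is a minimal sketch of how one ground-truth box could be encoded into a (7, 7, 7) target. Several things here are assumptions, not stated in the post: the helper name `encode_box` is hypothetical, box values are taken to be image-relative (0 to 1), the x/y midpoints are stored as offsets within the owning cell (the paper's convention), and the grid is indexed as [row][col]. Plain lists are used so the sketch runs without dependencies.

```python
GRID = 7    # 7x7 cells, as in the post
DEPTH = 7   # [p_obj, x, y, w, h, p_dog, p_cat]
CLASS_CHANNEL = {"dog": 5, "cat": 6}  # channel index of each class probability


def encode_box(x_center, y_center, width, height, label):
    """Return a 7x7x7 nested-list target for one box, plus the owning (row, col).

    All four box values are assumed to be fractions of the image size (0..1).
    """
    target = [[[0.0] * DEPTH for _ in range(GRID)] for _ in range(GRID)]

    # The cell that contains the box midpoint is responsible for the box.
    col = min(int(x_center * GRID), GRID - 1)
    row = min(int(y_center * GRID), GRID - 1)

    cell = target[row][col]
    cell[0] = 1.0                      # something is in this cell
    cell[1] = x_center * GRID - col    # x midpoint as an offset inside the cell (assumption)
    cell[2] = y_center * GRID - row    # y midpoint as an offset inside the cell (assumption)
    cell[3] = width                    # box width relative to the image (assumption)
    cell[4] = height                   # box height relative to the image (assumption)
    cell[CLASS_CHANNEL[label]] = 1.0   # one-hot class probability
    return target, (row, col)


target, (row, col) = encode_box(0.5, 0.3, 0.2, 0.4, "dog")
print(row, col, target[row][col])
```

With this layout, every cell other than the owning one stays all zeros, which is what the `y[:,:,:,0:1] > 0` mask in the cost function relies on.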
I think there is a problem with either my forward prop or my cost function, because my model is not learning a meaningful representation.
https://arxiv.org/pdf/1506.02640.pdf
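For reference, here is a sketch of the paper's loss specialized to this setup, assuming one box per cell (B = 1), a 7x7 grid (49 cells), and two classes. Hatted symbols are taken to be predictions (Z13) and unhatted ones ground truth (y); the indicator is 1 for cells that contain an object (obj) or do not (noobj). The five lines are meant to line up with part1 through part5 in the code; the expression is per image, whereas the code also sums over the batch.

```latex
\begin{aligned}
\mathcal{L} ={}& \lambda_{\text{coord}} \sum_{i=1}^{49} \mathbb{1}_i^{\text{obj}}
      \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 \right] \\
&+ \lambda_{\text{coord}} \sum_{i=1}^{49} \mathbb{1}_i^{\text{obj}}
      \left[ \left(\sqrt{w_i} - \sqrt{\hat{w}_i}\right)^2 + \left(\sqrt{h_i} - \sqrt{\hat{h}_i}\right)^2 \right] \\
&+ \sum_{i=1}^{49} \mathbb{1}_i^{\text{obj}} \left( C_i - \hat{C}_i \right)^2 \\
&+ \lambda_{\text{noobj}} \sum_{i=1}^{49} \mathbb{1}_i^{\text{noobj}} \left( C_i - \hat{C}_i \right)^2 \\
&+ \sum_{i=1}^{49} \mathbb{1}_i^{\text{obj}} \sum_{c \in \{\text{dog},\,\text{cat}\}} \left( p_i(c) - \hat{p}_i(c) \right)^2
\end{aligned}
```

Here \lambda_{\text{coord}} = 5 and \lambda_{\text{noobj}} = 0.5, matching the defaults of `coord` and `noobj`.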
import tensorflow as tf

def cost_function(Z13, y, coord=5, noobj=0.5):
    """
    Z13: predictions, shape (None, 7, 7, 7)
    y: ground truth, shape (None, 7, 7, 7)
    """
    # Masks are needed because the box-coordinate and class terms only apply
    # to cells that actually contain a bounding box
    c_mask_true = y[:, :, :, 0:1] > 0   # cells that contain a bounding box
    c_mask_false = y[:, :, :, 0:1] < 1  # cells without a bounding box
    # Confidence scores
    ci_guess_t = tf.boolean_mask(Z13[:, :, :, 0:1], c_mask_true)
    ci_guess_f = tf.boolean_mask(Z13[:, :, :, 0:1], c_mask_false)
    ci_act_t = tf.boolean_mask(y[:, :, :, 0:1], c_mask_true)
    ci_act_f = tf.boolean_mask(y[:, :, :, 0:1], c_mask_false)
    # Bounding box coordinates, taken only from cells with a ground-truth box
    xi_guess = tf.boolean_mask(Z13[:, :, :, 1:2], c_mask_true)  # midpoint x position
    xi_act = tf.boolean_mask(y[:, :, :, 1:2], c_mask_true)
    yi_guess = tf.boolean_mask(Z13[:, :, :, 2:3], c_mask_true)  # midpoint y position
    yi_act = tf.boolean_mask(y[:, :, :, 2:3], c_mask_true)
    # Width:
    wi_guess = tf.boolean_mask(Z13[:, :, :, 3:4], c_mask_true)  # box width
    # Prevent sqrt(neg) and increase the cost for negative predictions
    wi_guess = tf.minimum(tf.sqrt(tf.abs(wi_guess)), wi_guess)
    wi_act = tf.sqrt(tf.boolean_mask(y[:, :, :, 3:4], c_mask_true))
    # Height:
    hi_guess = tf.boolean_mask(Z13[:, :, :, 4:5], c_mask_true)  # box height
    # Prevent sqrt(neg) and increase the cost for negative predictions
    hi_guess = tf.minimum(tf.sqrt(tf.abs(hi_guess)), hi_guess)
    hi_act = tf.sqrt(tf.boolean_mask(y[:, :, :, 4:5], c_mask_true))
    # Predicted classes:
    class_g_dog = tf.boolean_mask(Z13[:, :, :, 5:6], c_mask_true)
    class_t_dog = tf.boolean_mask(y[:, :, :, 5:6], c_mask_true)
    class_g_cat = tf.boolean_mask(Z13[:, :, :, 6:7], c_mask_true)
    class_t_cat = tf.boolean_mask(y[:, :, :, 6:7], c_mask_true)
    # Parts correspond to the five terms of the cost function in the paper
    part1 = coord * tf.reduce_sum(tf.square(xi_act - xi_guess) + tf.square(yi_act - yi_guess))
    part2 = coord * tf.reduce_sum(tf.square(wi_act - wi_guess) + tf.square(hi_act - hi_guess))
    part3 = tf.reduce_sum(tf.square(ci_act_t - ci_guess_t))
    part4 = noobj * tf.reduce_sum(tf.square(ci_act_f - ci_guess_f))
    part5 = tf.reduce_sum(tf.square(class_t_dog - class_g_dog) + tf.square(class_t_cat - class_g_cat))
    total_cost = part1 + part2 + part3 + part4 + part5
    return total_cost
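One thing worth checking about the absolute-value step is what `tf.minimum(tf.sqrt(tf.abs(w)), w)` actually returns for different ranges of `w`. The scalar sketch below mirrors that expression in plain Python (no TensorFlow needed) and compares it with a signed square root. The function names and the `eps` parameter are hypothetical, and the signed square root is a possible alternative, not something taken from the paper.

```python
import math


def current_transform(w):
    """Scalar version of tf.minimum(tf.sqrt(tf.abs(w)), w) from the cost function."""
    return min(math.sqrt(abs(w)), w)


def signed_sqrt(w, eps=1e-6):
    """Hypothetical alternative: sqrt of the magnitude, carrying the sign of w.

    eps keeps the slope finite at w = 0, where the derivative of sqrt is unbounded.
    """
    return math.copysign(math.sqrt(abs(w) + eps), w)


for w in (-0.25, 0.25, 4.0):
    print(w, current_transform(w), signed_sqrt(w))
```

For 0 < w < 1 the square root is larger than w itself, so the minimum returns the raw prediction and the loss ends up comparing an untransformed width against the square root of the true width. If widths and heights are normalized to the image size, that is the range nearly all predictions fall in. A TensorFlow analogue of the alternative would presumably be `tf.sign(w) * tf.sqrt(tf.abs(w) + eps)`, though that is untested against this model.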