Question

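I want to implement YOLO (You Only Look Once) in Tensorflow. However, after writing the loss function and training the network, the loss value is extremely large, as shown below: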

2016-09-11 13:55:03.753679: step 0, loss = 3371113119744.00 (3.9 examples/sec; 2.548 sec/batch)  
2016-09-11 13:55:14.444871: step 10, loss = nan (19.8 examples/sec; 0.505 sec/batch)

The loss at step 0 is already huge, and at later steps it becomes NaN. I can't figure out why. Here is the loss function I wrote for YOLO:

def inference_loss(y_out, y_true):
    '''
    Args:
      y_true: Ground Truth output
      y_out: Predicted output
      The form of the ground truth vector is:
      ######################################
      ##1225 values in total: 7*7=49 cells
      ##each cell vector has 25 values: bounding box (x,y,h,w), class one hot vector (p1,p2,...,p20), objectness score (0 or 1)##
      ##49 * 25 = 1225
      ######################################

    Returns:
      The loss caused by y_out
    '''
    lambda_coor = 5
    lambda_noobj = 0.5

    box_loss = 0.0
    score_loss = 0.0
    class_loss = 0.0

    for i in range(49):
        #the first bounding box
        y_out_box1 = y_out[:,i*30:i*30+4]
        #the second bounding box
        y_out_box2 = y_out[:,i*30+4:i*30+8]
        #ground truth bounding box
        y_true_box = y_true[:,i*25:i*25+4]
        #l2 loss of the predicted bounding box
        box_loss_piece = tf.reduce_sum(tf.square(y_true_box - y_out_box1), 1) + tf.reduce_sum(tf.square(y_true_box - y_out_box2), 1)
        #bounding box loss
        box_loss_piece = box_loss_piece * lambda_coor * y_true[:,i*25+24]

        box_loss = box_loss + box_loss_piece
        #predicted score
        y_out_score1 = y_out[:,i*30+8]
        y_out_score2 = y_out[:,i*30+9]
        #ground truth score
        y_true_score = y_true[:,i*25+24]
        #the first score
        score_loss1_piece = tf.square(y_true_score - y_out_score1) + tf.square(y_true_score - y_out_score2)
        #the second score
        score_loss2_piece = lambda_noobj * score_loss1_piece
        #score loss
        score_loss1_piece = score_loss1_piece * y_true[:,i*25+24]
        score_loss2_piece = score_loss2_piece * (1 - y_true[:,i*25+24])

        score_loss = score_loss + score_loss1_piece + score_loss2_piece
        #one hot predicted class vector and ground truth vector
        y_out_class = y_out[:,i*30+10:(i+1)*30]
        y_true_class = y_true[:,i*25+4:i*25+24]
        # class loss
        class_loss_piece = tf.reduce_sum(tf.square(y_true_class - y_out_class), 1)
        class_loss = class_loss + class_loss_piece * y_true[:,i*25+24]

    #total loss of one batch
    loss = tf.reduce_sum(box_loss+score_loss+class_loss, 0)
    return loss
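
To sanity-check the magnitude outside of TensorFlow, the same arithmetic can be mirrored in plain Python for a single example. This is only a sketch that follows the indexing used in inference_loss (30 predicted values per cell, 25 ground-truth values per cell, 49 cells); reference_loss and the constants are hypothetical names introduced here, not part of the original code.

```python
# Hypothetical plain-Python mirror of inference_loss for ONE example.
# Layout assumptions are taken from the indexing in the question:
#   prediction:   49 cells * 30 values = 1470 (box1, box2, score1, score2, 20 classes)
#   ground truth: 49 cells * 25 values = 1225 (box, 20 classes, objectness)
NUM_CELLS = 49
PRED_PER_CELL = 30
TRUE_PER_CELL = 25


def squared_distance(a, b):
    """Sum of squared differences between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def reference_loss(y_out, y_true, lambda_coor=5.0, lambda_noobj=0.5):
    assert len(y_out) == NUM_CELLS * PRED_PER_CELL
    assert len(y_true) == NUM_CELLS * TRUE_PER_CELL
    total = 0.0
    for i in range(NUM_CELLS):
        p = i * PRED_PER_CELL   # offset of cell i in the prediction
        t = i * TRUE_PER_CELL   # offset of cell i in the ground truth
        obj = y_true[t + 24]    # 1.0 if the cell contains an object, else 0.0

        # Box term: both predicted boxes against the one ground-truth box.
        true_box = y_true[t:t + 4]
        box = (squared_distance(true_box, y_out[p:p + 4])
               + squared_distance(true_box, y_out[p + 4:p + 8]))
        box_loss = lambda_coor * obj * box

        # Score term: full weight for object cells, lambda_noobj otherwise.
        score = (obj - y_out[p + 8]) ** 2 + (obj - y_out[p + 9]) ** 2
        score_loss = obj * score + (1.0 - obj) * lambda_noobj * score

        # Class term: only counted for object cells.
        class_loss = obj * squared_distance(y_true[t + 4:t + 24],
                                            y_out[p + 10:p + 30])

        total += box_loss + score_loss + class_loss
    return total


if __name__ == "__main__":
    zeros_true = [0.0] * (NUM_CELLS * TRUE_PER_CELL)
    large_out = [1000.0] * (NUM_CELLS * PRED_PER_CELL)
    # Raw outputs of about 1000 with no objects at all.
    print(reference_loss(large_out, zeros_true))
```

Because every term is a squared difference, the value grows with the square of the raw network outputs, so unnormalized outputs from the last layer can give a very large loss at step 0. Whether that is what happens in this network is not confirmed by anything in the question.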

This is the training code I wrote:

def train_test():
    with tf.Graph().as_default():
        global_step = tf.Variable(0, trainable=False)

        data_batch_generator = yolo_inputs.generate_batch_data(voclabelpath, imagenamefile, BATCH_NUM, sample_number=10000, iteration = 5000)

        training_image_batch = tf.placeholder(tf.float32, shape = [BATCH_NUM, 448, 448, 3])
        training_label_batch = tf.placeholder(tf.float32, shape = [BATCH_NUM, 1225])

        #inference and loss
        yolotinyinstance = yolo_tiny.YOLO()
        yolotinyinstance.build(training_image_batch)        
        net_out = yolotinyinstance.fc12

        loss = inference_loss(net_out, training_label_batch)          

        train_op = train(loss, global_step)

        saver = tf.train.Saver(tf.all_variables())

        summary_op = tf.merge_all_summaries()

        init = tf.initialize_all_variables()

        sess = tf.Session()
        sess.run(init)          

        summary_writer = tf.train.SummaryWriter(TRAIN_DIR, sess.graph)

        step = 0

        for x,y in data_batch_generator:
            start_time = time.time()

            _, loss_value = sess.run([train_op, loss], feed_dict = {training_image_batch: x, 
                                     training_label_batch:y})
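
Since the log shows the loss turning into nan by step 10, one option is to stop the loop as soon as the fetched loss_value is no longer finite, so the first bad step can be inspected. A minimal sketch, assuming loss_value is the scalar returned by sess.run; check_loss is a hypothetical helper, not part of the original code:

```python
import math


def check_loss(step, loss_value):
    """Return loss_value unchanged, or raise if it is NaN or infinite.

    Hypothetical helper: meant to be called right after sess.run inside
    the training loop, with the step counter and the fetched loss.
    """
    value = float(loss_value)
    if math.isnan(value) or math.isinf(value):
        raise ValueError("loss is not finite at step %d: %r" % (step, value))
    return value


if __name__ == "__main__":
    print(check_loss(0, 3371113119744.0))  # finite, passes through
    try:
        check_loss(10, float("nan"))
    except ValueError as err:
        print(err)
```

This only reports where the loss first becomes non-finite; it does not explain or fix the cause.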

This has confused me for a while. Can anyone help? Thank you very much.

Tensorflow: huge loss function value output

0 answers: