Question

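I'm training my own object localization system with TensorFlow. The input is an image of shape (1170, 765, 1) and there is only one target class. My output vector y_hat looks like this: [P, x, y, w, h], where P is the probability that an object is present in the image and x, y, w, h are the coordinates and dimensions of the bounding box.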

My cost function is: (y_1)(MSE(y, y_hat)) + (1 - y_1)(MSE(y_1, y_hat_1)) (taken from Andrew Ng's Deep Learning Specialization).
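Written out per sample, that cost can be sketched in NumPy as follows. This is a minimal sketch, assuming `y` and `y_hat` are arrays of shape (n, 5) laid out as [P, x, y, w, h]; the name `localization_cost` and the `cost_w` argument are illustrative, and the L2 regularization term is left out.

```python
import numpy as np

def localization_cost(y, y_hat, cost_w=1.0):
    """Mean per-sample cost for targets/predictions laid out as [P, x, y, w, h]."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    obj = y[:, 0]                      # 1.0 if an object is present, else 0.0
    sq_err = (y_hat - y) ** 2
    # Object present: MSE over all five components, scaled by cost_w (assumed weight).
    cost_pos = cost_w * obj * sq_err.mean(axis=1)
    # No object: only the probability component is penalised.
    cost_neg = (1.0 - obj) * sq_err[:, 0]
    return float(np.mean(cost_pos + cost_neg))
```

For a sample with no object, the predicted box values do not contribute to the cost at all; only the error on P does.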

I compute the AP score with the following routine:

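import numpy as np
from sklearn.metrics import average_precision_score

def AP(y, y_hat):
    y_scores = y_hat[:, 0]       # predicted object probability
    y_hat_bbox = y_hat[:, 1:]    # predicted boxes (x, y, w, h)
    y_bbox = y[:, 1:]            # ground-truth boxes (x, y, w, h)
    y_true = np.zeros(y_scores.shape[0])

    # Label each sample 1.0 or 0.0 depending on whether the IoU passes the threshold
    for i in range(y_true.shape[0]):
        y_true[i] = iou(get_corners(y_hat_bbox[i]), get_corners(y_bbox[i]), True)
    ap = average_precision_score(y_true=y_true, y_score=y_scores)
    return ap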

average_precision_score is the one provided by scikit-learn, and iou essentially returns 1.0 if the IoU value is greater than a given threshold, and 0.0 otherwise.
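The `get_corners` and `iou` helpers are not shown in the question. A minimal sketch of what they might look like, assuming (x, y) is the box centre, that corners are returned as (x_min, y_min, x_max, y_max), and that the third argument switches thresholding on; the `binarize` and `threshold` names and the 0.5 default are assumptions:

```python
def get_corners(box):
    """Convert (x, y, w, h) to (x_min, y_min, x_max, y_max), assuming (x, y) is the centre."""
    x, y, w, h = box
    return (x - w / 2.0, y - h / 2.0, x + w / 2.0, y + h / 2.0)

def iou(a, b, binarize=False, threshold=0.5):
    """IoU of two corner-format boxes; with binarize=True, return 1.0 or 0.0."""
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    value = inter / union if union > 0 else 0.0
    if binarize:
        return 1.0 if value > threshold else 0.0
    return value
```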

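I'm using 3 convolutional layers, two of them with a stride of 2, followed by three max-pooling layers with a stride of 4 (to shrink the input quickly), then a fully connected layer of 35 neurons, and finally a 5-neuron output layer.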

This is the training function:

def conv_train(costW,LRates,RegParam):    
    costs = []
    n_samples = X_train.shape[0]  

    varsnames = tf.trainable_variables()
    lossL2 = tf.add_n([ tf.nn.l2_loss(v) for v in varsnames if 'bias' not in v.name]) * RegParam    
    cost_pos = costW*tf.multiply( y[:,0],tf.reduce_mean(tf.squared_difference(predictions,y),1))
    cost_neg = tf.multiply(tf.subtract(1.0,y[:,0]),tf.squared_difference(predictions[:,0],y[:,0]))  
    cost = tf.reduce_mean(tf.add(cost_pos,cost_neg)) + lossL2  
    optimizer = tf.train.AdamOptimizer(learning_rate = LRates).minimize(cost) 

    tf.set_random_seed(1)
    sess = tf.InteractiveSession()
    sess.run(tf.global_variables_initializer())
    saver = tf.train.Saver()
    seed = 0

    for epoch in range(training_epochs):
        epoch_cost = 0.  
        num_minibatches = int(n_samples / minibatch_size)
        print("Epoch " + str(seed+1))
        print(datetime.datetime.time(datetime.datetime.now()))
        seed = seed + 1
        minibatches = random_mini_batches(X_train, y_train,mini_batch_size=minibatch_size, seed=seed)
        for minibatch in minibatches:
            (minibatch_X, minibatch_y) = minibatch
            _ , minibatch_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X, y: minibatch_y})
            epoch_cost += minibatch_cost / num_minibatches
        print ("Cost after epoch %i: %f" % (epoch, epoch_cost))
        sys.stdout.flush()
        costs.append(epoch_cost)
    saver.save(sess, logPath + "/Model/model.ckpt")

    plot_cost(costs)

    test_results = predictions.eval({X: X_test})
    train_result = predictions.eval({X: X_train})

    AP_train = write_log(y = y_train, y_hat = train_result, flag = "Train")
    AP = write_log(y = y_test, y_hat = test_results, flag = "Test")

    sess.close()

    return AP
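The `random_mini_batches` helper called in the training loop is not shown. A minimal sketch of a seeded version, assuming it shuffles `X` and `y` together and keeps a smaller final batch (both are assumptions about the original helper):

```python
import numpy as np

def random_mini_batches(X, y, mini_batch_size=64, seed=0):
    """Shuffle X and y together and split them into mini-batches."""
    rng = np.random.RandomState(seed)   # local generator: same seed, same permutation
    perm = rng.permutation(X.shape[0])
    X_shuf, y_shuf = X[perm], y[perm]
    # The last batch may be smaller than mini_batch_size.
    return [
        (X_shuf[i:i + mini_batch_size], y_shuf[i:i + mini_batch_size])
        for i in range(0, X.shape[0], mini_batch_size)
    ]
```

If the helper keeps a smaller final batch, as this sketch does, `len(minibatches)` can be one more than `num_minibatches`, so the averaged epoch cost would come out slightly too high.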

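I intend to run a grid search to find the best learning rate, the best regularization parameter, and an extra weight that scales the part of the cost function penalizing the bounding box.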

APs = []
reg_params = np.power(2,np.linspace(-14,-6, num = 7))
LearningRates = np.power(2,np.linspace(-12,-7, num = 5))
cost_weights = np.power(2,np.linspace(2,5, num = 12))
for LR in LearningRates:
    for reg_param in reg_params:
        for cost_w in cost_weights:
            tf.reset_default_graph()
            imgRows, imgCols, n_channels = targetShape
            X = tf.placeholder("float", [None, imgRows, imgCols, n_channels])
            y = tf.placeholder("float", [None, 5])
            predictions = conv_network_model(X,35)
            AP = conv_train(cost_w,LR,reg_param)
            APs.append(AP)
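For a sense of scale, the same grid can be enumerated with `itertools.product`. This is only a sketch for counting the combinations; the lower-case variable names are illustrative and no training is run here.

```python
import itertools
import numpy as np

reg_params = np.power(2.0, np.linspace(-14, -6, num=7))
learning_rates = np.power(2.0, np.linspace(-12, -7, num=5))
cost_weights = np.power(2.0, np.linspace(2, 5, num=12))

# Same nesting order as the loops: learning rate, then regularization, then cost weight.
grid = list(itertools.product(learning_rates, reg_params, cost_weights))
print(len(grid))  # 5 * 7 * 12 = 420 full training runs
```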

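But then I realized that running the training process with the same parameters on the same data gives very different AP values (the mini-batch procedure guarantees that the mini-batches are identical on every run, i.e. batch 1 of epoch 1 is the same in the first and second training runs). I know that running convolutions on a GPU introduces some randomness, and that the initial weights can start from very different places. But the differences between AP values are large; for example, after running the cell 3 times I got the following results: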

AP train: 0.1846, 0.3488, 0.42

AP test: 0.1203, 0.4326, 0.2254

I have even seen the AP on the test set come close to 0.6 and then, after re-running the training process, drop to 0.3, half of that!

I'm not sure whether my AP implementation is correct, or whether results this different are normal.

Inconsistent average precision for a convolutional network with object localization output

0 answers: