I am training my own object localization system in TensorFlow. The input is images of shape (1170, 765, 1), and the system has only one target class. My output vector y_hat looks like this: [P, x, y, w, h], where P is the probability that an object is present in the image, and x, y, w, h are the coordinates and dimensions of the bounding box.
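For concreteness, this is what a positive and a negative label look like under this encoding (writing the box coordinates as fractions of the image size here is purely for illustration, not necessarily how I store them):

import numpy as np

# Positive example: object present, box centered at (0.4, 0.3), 20% x 15% of the image
y_pos = np.array([1.0, 0.4, 0.3, 0.2, 0.15])
# Negative example: no object, so only the first entry matters for the loss
y_neg = np.array([0.0, 0.0, 0.0, 0.0, 0.0])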
My cost function is: y_1 * MSE(y, y_hat) + (1 - y_1) * MSE(y_1, y_hat_1), where y_1 is the ground-truth presence indicator (taken from Andrew Ng's Deep Learning Specialization).
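As a sanity check, here is the same cost written out in NumPy for a batch. It mirrors the cost_pos / cost_neg terms in my training function below; costW is the extra weight on the bounding-box part that I grid-search over later:

import numpy as np

def localization_cost(y, y_hat, costW=1.0):
    # y[:, 0] is the ground-truth presence indicator (y_1 in the formula)
    presence = y[:, 0]
    # Positive term: full MSE over [P, x, y, w, h] when an object is present
    mse_full = np.mean((y - y_hat) ** 2, axis=1)
    # Negative term: squared error on the presence probability only
    se_presence = (y[:, 0] - y_hat[:, 0]) ** 2
    return np.mean(costW * presence * mse_full + (1.0 - presence) * se_presence)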
I am computing the AP score with the following routine:
import numpy as np
from sklearn.metrics import average_precision_score

def AP(y, y_hat):
    # Predicted object-presence probabilities
    y_scores = y_hat[:, 0]
    # Predicted and ground-truth bounding boxes
    y_hat_bbox = y_hat[:, 1:]
    y_bbox = y[:, 1:]
    # Binarize each prediction: 1.0 if the IoU with the ground truth passes the threshold
    y_true = np.zeros(y_scores.shape[0])
    for i in range(y_true.shape[0]):
        y_true[i] = iou(get_corners(y_hat_bbox[i]), get_corners(y_bbox[i]), True)
    AP = average_precision_score(y_true=y_true, y_score=y_scores)
    return AP
average_precision_score is the one provided by scikit-learn, and iou basically returns 1.0 if the IoU value is above a given threshold and 0.0 otherwise.
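For clarity, this is roughly what get_corners and iou do (a simplified sketch; the 0.5 threshold and the center-based box convention here are just examples, my actual threshold is a parameter):

def get_corners(bbox):
    # Convert (x_center, y_center, w, h) to (x1, y1, x2, y2)
    x, y, w, h = bbox
    return (x - w / 2, y - h / 2, x + w / 2, y + h / 2)

def iou(box_a, box_b, binarize, threshold=0.5):
    # Intersection rectangle between the two boxes
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    value = inter / (area_a + area_b - inter)
    # With binarize=True, return 1.0 above the threshold, 0.0 otherwise
    return float(value > threshold) if binarize else value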
I am using 3 convolutional layers, two of them with stride 2, each followed by a max-pooling layer with stride 4 (this is meant to quickly shrink the input), then a fully connected layer of 35 neurons, and finally a 5-neuron output layer.
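conv_network_model is not shown here; as a rough sketch of the architecture just described (the filter counts and kernel sizes below are placeholders, not my actual values):

import tensorflow as tf

def conv_network_model(X, fc_units):
    # 3 conv layers, two of them with stride 2, each followed by max pooling with stride 4
    h = tf.layers.conv2d(X, filters=8, kernel_size=5, strides=2, activation=tf.nn.relu)
    h = tf.layers.max_pooling2d(h, pool_size=4, strides=4)
    h = tf.layers.conv2d(h, filters=16, kernel_size=3, strides=2, activation=tf.nn.relu)
    h = tf.layers.max_pooling2d(h, pool_size=4, strides=4)
    h = tf.layers.conv2d(h, filters=32, kernel_size=3, strides=1, activation=tf.nn.relu)
    h = tf.layers.max_pooling2d(h, pool_size=4, strides=4)
    h = tf.layers.flatten(h)
    # 35 fully connected neurons, then the 5-neuron output [P, x, y, w, h]
    h = tf.layers.dense(h, fc_units, activation=tf.nn.relu)
    return tf.layers.dense(h, 5)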
This is the training function:
import sys
import datetime
import tensorflow as tf

def conv_train(costW, LRates, RegParam):
    costs = []
    n_samples = X_train.shape[0]
    # L2 regularization over all non-bias variables
    varsnames = tf.trainable_variables()
    lossL2 = tf.add_n([tf.nn.l2_loss(v) for v in varsnames if 'bias' not in v.name]) * RegParam
    # Cost for images that contain an object: weighted MSE over the full output vector
    cost_pos = costW * tf.multiply(y[:, 0], tf.reduce_mean(tf.squared_difference(predictions, y), 1))
    # Cost for images without an object: squared error on the presence probability only
    cost_neg = tf.multiply(tf.subtract(1.0, y[:, 0]), tf.squared_difference(predictions[:, 0], y[:, 0]))
    cost = tf.reduce_mean(tf.add(cost_pos, cost_neg)) + lossL2
    optimizer = tf.train.AdamOptimizer(learning_rate=LRates).minimize(cost)
    tf.set_random_seed(1)
    sess = tf.InteractiveSession()
    sess.run(tf.global_variables_initializer())
    saver = tf.train.Saver()
    seed = 0
    for epoch in range(training_epochs):
        epoch_cost = 0.
        num_minibatches = int(n_samples / minibatch_size)
        print("Epoch " + str(seed + 1))
        print(datetime.datetime.time(datetime.datetime.now()))
        seed = seed + 1
        # Reshuffle with a fixed, epoch-dependent seed so the batches are identical across runs
        minibatches = random_mini_batches(X_train, y_train, mini_batch_size=minibatch_size, seed=seed)
        for minibatch in minibatches:
            (minibatch_X, minibatch_y) = minibatch
            _, minibatch_cost = sess.run([optimizer, cost], feed_dict={X: minibatch_X, y: minibatch_y})
            epoch_cost += minibatch_cost / num_minibatches
        print("Cost after epoch %i: %f" % (epoch, epoch_cost))
        sys.stdout.flush()
        costs.append(epoch_cost)
    saver.save(sess, logPath + "/Model/model.ckpt")
    plot_cost(costs)
    test_results = predictions.eval({X: X_test})
    train_result = predictions.eval({X: X_train})
    AP_train = write_log(y=y_train, y_hat=train_result, flag="Train")
    AP = write_log(y=y_test, y_hat=test_results, flag="Test")
    sess.close()
    return AP
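random_mini_batches is the shuffling helper from the same course; a condensed version to show why I expect identical batches across runs (the fixed seed makes the permutation reproducible):

def random_mini_batches(X, Y, mini_batch_size=64, seed=0):
    # Deterministic shuffle: the same seed always yields the same permutation
    np.random.seed(seed)
    m = X.shape[0]
    permutation = list(np.random.permutation(m))
    shuffled_X, shuffled_Y = X[permutation], Y[permutation]
    # Slice the shuffled data into consecutive minibatches
    minibatches = []
    for k in range(0, m, mini_batch_size):
        minibatches.append((shuffled_X[k:k + mini_batch_size],
                            shuffled_Y[k:k + mini_batch_size]))
    return minibatches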
I intend to run a grid search to find the best learning rate, regularization parameter, and the extra weight that scales the part of the cost function that penalizes the bounding boxes.
APs = []
reg_params = np.power(2, np.linspace(-14, -6, num=7))
LearningRates = np.power(2, np.linspace(-12, -7, num=5))
cost_weights = np.power(2, np.linspace(2, 5, num=12))
for LR in LearningRates:
    for reg_param in reg_params:
        for cost_w in cost_weights:
            # Rebuild the graph from scratch for every hyperparameter combination
            tf.reset_default_graph()
            imgRows, imgCols, n_channels = targetShape
            X = tf.placeholder("float", [None, imgRows, imgCols, n_channels])
            y = tf.placeholder("float", [None, 5])
            predictions = conv_network_model(X, 35)
            AP = conv_train(cost_w, LR, reg_param)
            APs.append(AP)
But then I realized that running the training procedure on the same data with the same parameters (the minibatch routine guarantees that the minibatches are identical on every run, i.e., batch 1 of epoch 1 is the same in the first and the second training) gives wildly different AP values. I know that running convolutions on the GPU introduces some nondeterminism, and that the initial weights can start from very different points, but the differences between the AP values are huge. For example, after running the cell 3 times I got:

AP Train: 0.1846 - 0.3488 - 0.42
AP Test: 0.1203 - 0.4326 - 0.2254

I have even had a case where the AP on the test set was close to 0.6, and after re-running the training procedure it dropped by half, to 0.3! I am not sure whether my AP implementation is correct, or whether getting such different results is normal.