I am trying to train a deep convolutional neural network on the LFW dataset (2200 pairs of faces: 1100 belonging to the same person, 1100 belonging to different people). The problem is that while the loss decreases during training, the training accuracy stays the same or even gets worse compared to the first epoch. I am using a fairly low learning rate. This is what I get with 0.0001:
Epoch 0 training complete
Loss: 0.10961
Accuracy: 0.549
Epoch 1 training complete
Loss: 0.10671
Accuracy: 0.554
Epoch 2 training complete
Loss: 0.10416
Accuracy: 0.559
Epoch 3 training complete
Loss: 0.10152
Accuracy: 0.553
Epoch 4 training complete
Loss: 0.09854
Accuracy: 0.563
Epoch 5 training complete
Loss: 0.09693
Accuracy: 0.565
Epoch 6 training complete
Loss: 0.09473
Accuracy: 0.563
Epoch 7 training complete
Loss: 0.09250
Accuracy: 0.566
Epoch 8 training complete
Loss: 0.09137
Accuracy: 0.565
And this is what I get with a learning rate of 0.0005:
Epoch 0 training complete
Loss: 0.09443
Accuracy: 0.560
Epoch 1 training complete
Loss: 0.08151
Accuracy: 0.565
Epoch 2 training complete
Loss: 0.07635
Accuracy: 0.560
Epoch 3 training complete
Loss: 0.07394
Accuracy: 0.560
Epoch 4 training complete
Loss: 0.07183
Accuracy: 0.555
Epoch 5 training complete
Loss: 0.06996
Accuracy: 0.563
Epoch 6 training complete
Loss: 0.06878
Accuracy: 0.556
Epoch 7 training complete
Loss: 0.06743
Accuracy: 0.538
Epoch 8 training complete
Loss: 0.06689
Accuracy: 0.538
Epoch 9 training complete
Loss: 0.06680
Accuracy: 0.549
Epoch 10 training complete
Loss: 0.06559
Accuracy: 0.542
My model is implemented in TensorFlow. This is the network architecture:
def _get_output_ten(self, inputs_ph, embedding_dimension):
    with tf.variable_scope(self.var_scope, reuse=self.net_vars_created):
        if self.net_vars_created is None:
            self.net_vars_created = True

        inputs = tf.reshape(inputs_ph, [-1, self.width, self.height, 1])
        weights_init = tf.random_normal_initializer(mean=0.0, stddev=0.1)

        # returns 60 x 60 x 15
        net = tf.layers.conv2d(
            inputs=inputs,
            filters=15,
            kernel_size=(5, 5),
            strides=1,
            padding='valid',
            kernel_initializer=weights_init,
            activation=tf.nn.relu)
        # returns 30 x 30 x 15
        net = tf.layers.max_pooling2d(inputs=net, pool_size=(2, 2), strides=2)
        # returns 24 x 24 x 45
        net = tf.layers.conv2d(
            inputs=net,
            filters=45,
            kernel_size=(7, 7),
            strides=1,
            padding='valid',
            kernel_initializer=weights_init,
            activation=tf.nn.relu)
        # returns 6 x 6 x 45
        net = tf.layers.max_pooling2d(inputs=net, pool_size=(4, 4), strides=4)
        # returns 1 x 1 x 250
        net = tf.layers.conv2d(
            inputs=net,
            filters=250,
            kernel_size=(6, 6),
            strides=1,
            kernel_initializer=weights_init,
            activation=tf.nn.relu)

        net = tf.reshape(net, [-1, 1 * 1 * 250])
        net = tf.layers.dense(
            inputs=net,
            units=256,
            kernel_initializer=weights_init,
            activation=tf.nn.sigmoid)
        net = tf.layers.dense(
            inputs=net,
            units=embedding_dimension,
            kernel_initializer=weights_init,
            activation=tf.nn.sigmoid)
        net = tf.check_numerics(net, message='model')

        return net
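For reference, the spatial sizes in the comments above can be verified with the standard VALID-padding output formula. This little check assumes 64 x 64 grayscale inputs, which the post does not state explicitly but which the comment arithmetic implies:

def valid_out(size, kernel, stride=1):
    # output size of a VALID-padded conv/pool layer
    return (size - kernel) // stride + 1

s = valid_out(64, 5)    # conv 5x5, stride 1 -> 60
s = valid_out(s, 2, 2)  # pool 2x2, stride 2 -> 30
s = valid_out(s, 7)     # conv 7x7, stride 1 -> 24
s = valid_out(s, 4, 4)  # pool 4x4, stride 4 -> 6
s = valid_out(s, 6)     # conv 6x6, stride 1 -> 1
print(s)                # 1, matching the "1 x 1 x 250" comment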
I have also tried deeper architectures, but no matter how long they train they give a training accuracy of around 0.500 across all epochs. I am using a siamese architecture with contrastive loss. This is how training is implemented:
def train(self, x1s, x2s, ys, num_epochs, mini_batch_size, learning_rate, embedding_dimension, margin,
          monitor_training_loss=False, monitor_training_accuracy=False):
    input1_ph = tf.placeholder(dtype=tf.float32, shape=(mini_batch_size, self.width, self.height))
    input2_ph = tf.placeholder(dtype=tf.float32, shape=(mini_batch_size, self.width, self.height))
    labels_ph = tf.placeholder(dtype=tf.int32, shape=(mini_batch_size,))

    output1 = self._get_output_ten(input1_ph, embedding_dimension)
    output2 = self._get_output_ten(input2_ph, embedding_dimension)

    loss = self._get_loss_op(output1, output2, labels_ph, margin)
    loss = tf.Print(loss, [loss], message='loss')
    global_step = tf.Variable(initial_value=0, trainable=False, name='global_step')
    train_op = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss, global_step=global_step)

    num_batches = int(math.ceil(ys.shape[0] / mini_batch_size))

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())

        for ep in range(num_epochs):
            x1s, x2s, ys = unison_shuffle([x1s, x2s, ys], ys.shape[0])

            for bt_num in range(num_batches):
                bt_slice = slice(bt_num * mini_batch_size, (bt_num + 1) * mini_batch_size)
                sess.run(train_op, feed_dict={
                    input1_ph: x1s[bt_slice],
                    input2_ph: x2s[bt_slice],
                    labels_ph: ys[bt_slice]
                })

            print('Epoch {} training complete'.format(ep))
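One detail worth noting about this loop: the placeholders are created with a fixed batch dimension of mini_batch_size, so the dataset size must be an exact multiple of it (2200 / 50 = 44 full batches here, so it works out). A defensive guard like the following, purely a suggestion and not part of the original code, would make that explicit:

# Suggested check (not in the original code): the placeholders have a fixed
# first dimension of mini_batch_size, so a partial last batch would fail
# at feed time with a shape mismatch.
assert ys.shape[0] % mini_batch_size == 0, \
    'dataset size must be a multiple of mini_batch_size'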
This is how the loss is computed:
def _get_loss_op(output1, output2, labels, margin):
    labels = tf.to_float(labels)
    d_sqr = compute_euclidian_distance_square(output1, output2)
    loss_non_reduced = labels * d_sqr + (1 - labels) * tf.square(tf.maximum(0., margin - d_sqr))
    return 0.5 * tf.reduce_mean(tf.cast(loss_non_reduced, dtype=tf.float64))
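compute_euclidian_distance_square is not shown in the post; a minimal sketch of what it presumably computes (the row-wise squared Euclidean distance between the two embedding batches; see the linked repo for the actual implementation) would be:

# Presumed helper: squared L2 distance per pair, output shape (batch_size,).
def compute_euclidian_distance_square(x1, x2):
    return tf.reduce_sum(tf.square(x1 - x2), axis=1)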
And this is how I measure accuracy:
def _get_accuracy_op(out1, out2, labels, margin):
    distances = tf.sqrt(compute_euclidian_distance_square(out1, out2))
    gt_than_margin = tf.cast(tf.maximum(tf.subtract(distances, margin), 0.0), dtype=tf.bool)
    predictions = tf.cast(gt_than_margin, dtype=tf.int32)
    return tf.reduce_mean(tf.cast(tf.not_equal(predictions, labels), dtype=tf.float32))
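Since the bool/not_equal chain above is easy to misread: the op predicts "different person" (1) when the distance exceeds the margin, so counting the pairs where prediction and label differ does give accuracy under the label convention the loss implies (1 = same person, 0 = different). An equivalent, more direct formulation (up to the boundary case where the distance exactly equals the margin) would be:

# Equivalent rewrite for readability (assumes labels: 1 = same person,
# 0 = different, as the contrastive loss above implies).
def _get_accuracy_op_direct(out1, out2, labels, margin):
    distances = tf.sqrt(compute_euclidian_distance_square(out1, out2))
    predictions = tf.cast(distances < margin, dtype=tf.int32)  # 1 = same person
    return tf.reduce_mean(tf.cast(tf.equal(predictions, labels), dtype=tf.float32))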
I use a margin of 0.5 and a mini-batch size of 50. Monitoring the gradients turned up nothing; they seem fine. I also monitored the distances between the output embeddings, and they do not appear to be moving in the right direction.
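For completeness, a hypothetical invocation with those settings (the class name, image size, and embedding dimension below are placeholders for illustration, not the actual values from the repo):

# Illustrative call only; x1s/x2s hold the two faces of each pair,
# ys the 0/1 pair labels.
model = SiameseNet(width=64, height=64)  # placeholder class name
model.train(x1s, x2s, ys,
            num_epochs=10,
            mini_batch_size=50,       # 2200 pairs -> 44 full batches
            learning_rate=0.0001,
            embedding_dimension=64,   # placeholder value
            margin=0.5)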
Here is the repo with the full source code: https://github.com/andrei-papou/facever. It is not that big, so please take a look there in case I have not provided enough information here.
Thanks, everyone!