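I want to train a VGG16 model in TensorFlow. Because of the computational cost, I randomly split the ImageNet 2012 validation set into a training set (80%, 40 images per class) and a test set (20%, 10 images per class). I start training from the pretrained weights of keras.applications.vgg16.VGG16.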
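For illustration, a per-class split like the one described above could be sketched as follows. This is a minimal sketch, not the actual script used here: the function name `split_per_class`, the fixed seed, and the toy label list (1000 classes with 50 images each, mirroring the validation set) are all assumptions.

```python
import random
from collections import defaultdict

def split_per_class(labels, n_train=40, seed=0):
    """Split sample indices so each class puts n_train items in the training set.

    labels[i] is the class of sample i. Returns (train_indices, test_indices).
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        train.extend(indices[:n_train])   # first n_train shuffled samples
        test.extend(indices[n_train:])    # the remainder of the class
    return train, test

# Toy stand-in for the 50,000 validation labels: 1000 classes x 50 images.
toy_labels = [c for c in range(1000) for _ in range(50)]
train_idx, test_idx = split_per_class(toy_labels)
print(len(train_idx), len(test_idx))
```

Splitting within each class, rather than over the whole set at once, keeps exactly 40 training and 10 test images for every class.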
Here are my questions:
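The ImageNet 2012 validation set appears to come with two different sets of labels. At first I labeled the validation data using ILSVRC2012_devkit_t12/data/ILSVRC2012_validation_ground_truth.txt, but the accuracy of the pretrained model was almost zero. Then I found another label file, with which the pretrained model, without any further training, reaches 50% accuracy. Why are there two kinds of labels, and which one should be used with the pretrained model?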
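A possible explanation, stated here as an assumption rather than a confirmed fact: the devkit file stores ILSVRC2012_ID values (1-based, defined in the devkit's meta.mat), while the second label file and the Keras weights use the 0-based position of each WordNet ID (WNID) in sorted order. If that is the case, one label set can be converted to the other as sketched below. The function name and the three-entry table are hypothetical; the real table has 1000 entries and would have to be read from meta.mat.

```python
def devkit_id_to_sorted_index(id_to_wnid):
    """Map each devkit class ID to the 0-based rank of its WNID in sorted order.

    id_to_wnid: dict {ILSVRC2012_ID: WNID}, assumed to come from meta.mat.
    """
    sorted_wnids = sorted(id_to_wnid.values())
    wnid_to_index = {wnid: i for i, wnid in enumerate(sorted_wnids)}
    return {dev_id: wnid_to_index[wnid] for dev_id, wnid in id_to_wnid.items()}

# Hypothetical three-class excerpt, for illustration only.
toy_id_to_wnid = {1: "n02119789", 2: "n02100735", 3: "n02110185"}
remap = devkit_id_to_sorted_index(toy_id_to_wnid)
print(remap)  # {1: 2, 2: 0, 3: 1}
```

Under this assumption, a mismatch between the two conventions would make nearly every label wrong, which would be consistent with the near-zero accuracy.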
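I train VGG16 with two dropout layers (rate = 0.5) after fc6 and fc7. Accuracy on the training set rises to 88%, while accuracy on the test set only reaches 50%. The hyperparameters are: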
regularizer_conv = tf.contrib.layers.l2_regularizer(scale=0.002)
regularizer_fc = tf.contrib.layers.l2_regularizer(scale=0.002)
model.train(sess=session, n_epochs=20, lr=0.001)
model.train(sess=session, n_epochs=40, lr=0.0001)
model.train(sess=session, n_epochs=60, lr=0.00001)
The training log for the first 10 epochs is:
epoch=1, batch=1250/1251, curr_loss=10.656956, train_acc=40.747500%, used_time:1.00s
Epoch:1, val_acc=48.150000%, val_loss=10.497776
epoch=2, batch=1250/1251, curr_loss=10.434262, train_acc=46.047500%, used_time:0.99s
Epoch:2, val_acc=47.100000%, val_loss=10.343517
epoch=3, batch=1250/1251, curr_loss=10.222428, train_acc=51.230000%, used_time:1.00s
Epoch:3, val_acc=48.910000%, val_loss=10.170202
epoch=4, batch=1250/1251, curr_loss=10.036579, train_acc=54.595000%, used_time:1.00s
Epoch:4, val_acc=47.790000%, val_loss=10.031280
epoch=5, batch=1250/1251, curr_loss=9.859999, train_acc=57.695000%, used_time:0.99s
Epoch:5, val_acc=47.360000%, val_loss=9.890789
epoch=6, batch=720/1251, curr_loss=9.716121, train_acc=61.154514%, used_time:0.99s
Epoch:6, val_acc=48.110000%, val_loss=9.753889
epoch=7, batch=1250/1251, curr_loss=9.539424, train_acc=62.747500%, used_time:1.01s
Epoch:7, val_acc=45.440000%, val_loss=9.650524
epoch=8, batch=1250/1251, curr_loss=9.394654, train_acc=64.752500%, used_time:0.99s
Epoch:8, val_acc=47.010000%, val_loss=9.510695
epoch=9, batch=1250/1251, curr_loss=9.257372, train_acc=66.647500%, used_time:0.99s
Epoch:9, val_acc=46.740000%, val_loss=9.399683
epoch=10, batch=1250/1251, curr_loss=9.127071, train_acc=68.267500%, used_time:0.99s
Epoch:10, val_acc=47.560000%, val_loss=9.283911
...
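The gap between training and validation accuracy in the log above can be extracted with a short script. This is a minimal sketch that assumes the two line formats shown in the log; the regular expressions and the name `accuracy_gaps` are mine, and only the first and tenth epochs are embedded to keep it short.

```python
import re

# Four lines copied from the log above (epochs 1 and 10).
LOG = """\
epoch=1, batch=1250/1251, curr_loss=10.656956, train_acc=40.747500%, used_time:1.00s
Epoch:1, val_acc=48.150000%, val_loss=10.497776
epoch=10, batch=1250/1251, curr_loss=9.127071, train_acc=68.267500%, used_time:0.99s
Epoch:10, val_acc=47.560000%, val_loss=9.283911
"""

TRAIN_RE = re.compile(r"^epoch=(\d+),.*train_acc=([\d.]+)%")
VAL_RE = re.compile(r"^Epoch:(\d+), val_acc=([\d.]+)%")

def accuracy_gaps(log_text):
    """Return {epoch: train_acc - val_acc} in percentage points."""
    train, val = {}, {}
    for line in log_text.splitlines():
        m = TRAIN_RE.match(line)
        if m:
            train[int(m.group(1))] = float(m.group(2))
            continue
        m = VAL_RE.match(line)
        if m:
            val[int(m.group(1))] = float(m.group(2))
    # Keep only epochs that have both a training and a validation entry.
    return {e: train[e] - val[e] for e in sorted(train) if e in val}

gaps = accuracy_gaps(LOG)
for epoch, gap in gaps.items():
    print(f"epoch {epoch}: train - val = {gap:+.2f} points")
```

A gap that grows from negative at epoch 1 to roughly 20 points at epoch 10, while validation accuracy stays near 47 to 48%, is the pattern I am asking about.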
Generalization does not seem to improve noticeably. How can I improve the accuracy on the test set?