dlib dnn synchronization does not seem to store everything

Asked: 2017-05-24 07:47:48

Tags: c++ dlib

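I've noticed a strange behavior when using trainer.set_synchronization_file(). As far as I understand (from trainer_abstract.h and the implementation in trainer.h), it saves the current state of the trainer (including the net), so that training can be stopped at some point and resumed from exactly the same step. But with the example dnn_metric_learning_on_images_ex.cpp (from master at b4a54490783), if I stop the training (e.g. with ctrl + c) and then restart it, the loss drops significantly, as if some momentum had been reset and SGD were finding a better path to improve. Does anyone have an idea why?

Here is a code sample. Compared with the example, only the stopping conditions have been changed; the changes are to the calls that follow the synchronization file setup.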

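// sgd(weight decay, momentum); {1,0} lists the CUDA devices to use
dnn_trainer<net_type> trainer(net, sgd(0.0001, 0.9), {1,0});
trainer.set_learning_rate(0.1);
trainer.be_verbose();
// Save (and, on startup, reload) the trainer state every 5 minutes
trainer.set_synchronization_file("face_metric_sync", std::chrono::minutes(5));
// Shrink the learning rate after 10000 steps without apparent progress
trainer.set_iterations_without_progress_threshold(10000);
// data loaders (...)
// Train until the learning rate has been shrunk below 1e-5
while(trainer.get_learning_rate() >= 1e-5)
{
    qimages.dequeue(images);
    qlabels.dequeue(labels);
    trainer.train_one_step(images, labels);
}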

Here is an example of the output before stopping:

Saved state to face_metric_sync
step#: 198726  learning rate: 0.001  average loss: 0.00928623   steps without apparent progress: 5923
step#: 198844  learning rate: 0.001  average loss: 0.00950317   steps without apparent progress: 6183
step#: 198963  learning rate: 0.001  average loss: 0.00971744   steps without apparent progress: 6525
step#: 199082  learning rate: 0.001  average loss: 0.00917967   steps without apparent progress: 6681
step#: 199200  learning rate: 0.001  average loss: 0.00942927   steps without apparent progress: 6834
step#: 199319  learning rate: 0.001  average loss: 0.00938926   steps without apparent progress: 6941
step#: 199438  learning rate: 0.001  average loss: 0.00917057   steps without apparent progress: 6915
Saved state to face_metric_sync
step#: 199552  learning rate: 0.001  average loss: 0.00964872   steps without apparent progress: 7487
^C

After restarting:

objs.size(): 75656
step#: 199507  learning rate: 0.001  average loss: 0.00974885   steps without apparent progress: 7146
step#: 199654  learning rate: 0.001  average loss: 0.00720691   steps without apparent progress: 64
step#: 199812  learning rate: 0.001  average loss: 0.00687095   steps without apparent progress: 209
step#: 199970  learning rate: 0.001  average loss: 0.00705782   steps without apparent progress: 439
step#: 200128  learning rate: 0.001  average loss: 0.00690515   steps without apparent progress: 584
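
To make the "momentum was reset" suspicion concrete, here is a minimal toy sketch. It is plain C++, not dlib code, and it does not show that dlib actually drops the momentum; it only illustrates why a checkpoint that restored the weights but not the solver's velocity would resume on a different trajectory. The names SgdState, sgd_step, run, restore_full and restore_weights_only are hypothetical, and the objective f(w) = 0.5 * w^2 is an assumption chosen for simplicity.

```cpp
#include <cassert>
#include <cmath>

// Toy SGD-with-momentum state: one weight plus its velocity (momentum buffer).
// Hypothetical stand-in for "net parameters + solver state"; not a dlib type.
struct SgdState {
    double w;  // parameter
    double v;  // velocity accumulated by the momentum term
};

// One momentum update on f(w) = 0.5 * w^2, whose gradient is simply w.
inline void sgd_step(SgdState& s, double lr, double momentum) {
    const double grad = s.w;
    s.v = momentum * s.v - lr * grad;
    s.w += s.v;
}

// Apply `steps` updates starting from a copy of `s` and return the result.
inline SgdState run(SgdState s, int steps, double lr = 0.1, double momentum = 0.9) {
    assert(steps >= 0);
    for (int i = 0; i < steps; ++i) sgd_step(s, lr, momentum);
    return s;
}

// Checkpoint restore that keeps everything (weights and velocity).
inline SgdState restore_full(const SgdState& saved) { return saved; }

// Checkpoint restore that keeps only the weights and zeroes the velocity.
inline SgdState restore_weights_only(const SgdState& saved) {
    return SgdState{saved.w, 0.0};
}
```

Running 10 steps without interruption and running 5 steps, restoring with restore_full, then running 5 more should give the same state; swapping in restore_weights_only gives a visibly different one. If the real sync file behaved like the second case, a jump in the loss right after restart would be expected, which is what the logs above look like.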

0 answers:

No answers yet