The Caffe documentation on loss layers says I can set the loss_weight parameter on any layer to 0, which should remove that layer's contribution to the loss function. However, when I wire up two different loss layers and set the loss_weight of one of them to zero, the network stops training entirely, even though the other loss layer is still in place.
I will illustrate with LeNet on the MNIST dataset. I made two changes to the LeNet .prototxt:
1) Added a loss_weight parameter to the SoftmaxWithLoss layer, setting the loss weight to 0.5.
2) Duplicated the modified SoftmaxWithLoss layer, creating two identical loss layers, both with a loss weight of 0.5.
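For reference, here is roughly what those two changes produce (a sketch of my setup; the bottom blob names "ip2" and "ip_s2" match the layer shown further down, but the rest is reconstructed from the description above, not copied from the actual file):

```
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
  loss_weight: 0.5
}
layer {
  name: "loss2"
  type: "SoftmaxWithLoss"
  bottom: "ip_s2"
  bottom: "label"
  top: "loss2"
  loss_weight: 0.5
}
```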
Edit: here is the Caffe output for the two loss layers with loss_weights of 0.5:
I0729 01:06:58.008580 1336 solver.cpp:397] Test net output #0: accuracy = 0.0024
I0729 01:06:58.008597 1336 solver.cpp:397] Test net output #1: loss = 4.68486 (* 0.5 = 2.34243 loss)
I0729 01:06:58.008600 1336 solver.cpp:397] Test net output #2: loss2 = 4.68486 (* 0.5 = 2.34243 loss)
I0729 01:06:58.011636 1336 solver.cpp:218] Iteration 0 (0 iter/s, 0.102775s/100 iters), loss = 4.72888
I0729 01:06:58.011657 1336 solver.cpp:237] Train net output #0: loss = 4.72888 (* 0.5 = 2.36444 loss)
I0729 01:06:58.011662 1336 solver.cpp:237] Train net output #1: loss2 = 4.72888 (* 0.5 = 2.36444 loss)
I0729 01:06:58.011669 1336 sgd_solver.cpp:105] Iteration 0, lr = 0.01
I0729 01:06:58.035707 1336 solver.cpp:330] Iteration 10, Testing net (#0)
I0729 01:06:58.035722 1336 net.cpp:676] Ignoring source layer data
I0729 01:06:58.035724 1336 net.cpp:676] Ignoring source layer label_data_1_split
I0729 01:06:58.128404 1343 data_layer.cpp:73] Restarting data prefetching from start.
I0729 01:06:58.131825 1336 solver.cpp:397] Test net output #0: accuracy = 0.4744
I0729 01:06:58.131842 1336 solver.cpp:397] Test net output #1: loss = 1.70187 (* 0.5 = 0.850937 loss)
I0729 01:06:58.131846 1336 solver.cpp:397] Test net output #2: loss2 = 1.70187 (* 0.5 = 0.850937 loss)
I0729 01:06:58.155300 1336 solver.cpp:330] Iteration 20, Testing net (#0)
I0729 01:06:58.155344 1336 net.cpp:676] Ignoring source layer data
I0729 01:06:58.155352 1336 net.cpp:676] Ignoring source layer label_data_1_split
I0729 01:06:58.249866 1343 data_layer.cpp:73] Restarting data prefetching from start.
I0729 01:06:58.253347 1336 solver.cpp:397] Test net output #0: accuracy = 0.5036
I0729 01:06:58.253365 1336 solver.cpp:397] Test net output #1: loss = 1.47916 (* 0.5 = 0.739582 loss)
I0729 01:06:58.253368 1336 solver.cpp:397] Test net output #2: loss2 = 1.47916 (* 0.5 = 0.739582 loss)
Training proceeds perfectly normally, with the same performance as the unmodified .prototxt: accuracy climbs quickly to 0.90 in the first hundred or so iterations and then plateaus around 0.98.
Then, when one of the loss_weights is set to 0 and the other to 1, the model stops training entirely, with accuracy stuck at 0 and a constant loss of 87.3365.
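As an aside on that constant: 87.3365 is Caffe's telltale value for a softmax probability that has underflowed to zero. SoftmaxWithLoss clamps the predicted probability at FLT_MIN before taking the log, so a fully collapsed prediction yields exactly this number:

```python
import math

# FLT_MIN: smallest positive normalized single-precision float
FLT_MIN = 1.17549435e-38

# Caffe's SoftmaxWithLoss computes -log(max(p, FLT_MIN)); when the
# predicted probability p underflows to 0, the loss saturates at:
loss = -math.log(FLT_MIN)
print(round(loss, 4))  # 87.3365
```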
Here is what the added layer looks like in the Caffe .prototxt; everything else is identical to the base LeNet .prototxt:
layer {
name: "loss2"
type: "SoftmaxWithLoss"
bottom: "ip_s2"
bottom: "label"
top: "loss2"
loss_weight: 0
}
Here is the relevant Caffe output:
I0729 00:45:06.989990 1318 solver.cpp:218] Iteration 0 (-0.00385461 iter/s, 0.0972042s/100 iters), loss = 4.75995
I0729 00:45:06.990005 1318 solver.cpp:237] Train net output #0: loss = 4.75995 (* 1 = 4.75995 loss)
I0729 00:45:06.990010 1318 solver.cpp:237] Train net output #1: loss2 = 4.75995
I0729 00:45:06.990020 1318 sgd_solver.cpp:105] Iteration 0, lr = 0.01
I0729 00:45:07.013304 1318 solver.cpp:330] Iteration 10, Testing net (#0)
I0729 00:45:07.013320 1318 net.cpp:676] Ignoring source layer data
I0729 00:45:07.013322 1318 net.cpp:676] Ignoring source layer label_data_1_split
I0729 00:45:07.071669 1318 blocking_queue.cpp:49] Waiting for data
I0729 00:45:07.107491 1325 data_layer.cpp:73] Restarting data prefetching from start.
I0729 00:45:07.112396 1318 solver.cpp:397] Test net output #0: accuracy = 0
I0729 00:45:07.112416 1318 solver.cpp:397] Test net output #1: loss = 87.3365 (* 1 = 87.3365 loss)
I0729 00:45:07.112419 1318 solver.cpp:397] Test net output #2: loss2 = 87.3365
I0729 00:45:07.134907 1318 solver.cpp:330] Iteration 20, Testing net (#0)
I0729 00:45:07.134958 1318 net.cpp:676] Ignoring source layer data
I0729 00:45:07.134965 1318 net.cpp:676] Ignoring source layer label_data_1_split
I0729 00:45:07.223928 1325 data_layer.cpp:73] Restarting data prefetching from start.
I0729 00:45:07.226883 1318 solver.cpp:397] Test net output #0: accuracy = 0
I0729 00:45:07.226912 1318 solver.cpp:397] Test net output #1: loss = 87.3365 (* 1 = 87.3365 loss)
I0729 00:45:07.226920 1318 solver.cpp:397] Test net output #2: loss2 = 87.3365
Compare this with the Caffe output for the unedited LeNet prototxt:
I0729 00:51:35.134851 1327 solver.cpp:397] Test net output #0: accuracy = 0.0287
I0729 00:51:35.134869 1327 solver.cpp:397] Test net output #1: loss = 4.71456 (* 1 = 4.71456 loss)
I0729 00:51:35.137794 1327 solver.cpp:218] Iteration 0 (1.68966e+31 iter/s, 0.0944882s/100 iters), loss = 4.70328
I0729 00:51:35.137809 1327 solver.cpp:237] Train net output #0: loss = 4.70328 (* 1 = 4.70328 loss)
I0729 00:51:35.137821 1327 sgd_solver.cpp:105] Iteration 0, lr = 0.01
I0729 00:51:35.159623 1327 solver.cpp:330] Iteration 10, Testing net (#0)
I0729 00:51:35.159639 1327 net.cpp:676] Ignoring source layer data
I0729 00:51:35.249256 1334 data_layer.cpp:73] Restarting data prefetching from start.
I0729 00:51:35.252557 1327 solver.cpp:397] Test net output #0: accuracy = 0.3437
I0729 00:51:35.252575 1327 solver.cpp:397] Test net output #1: loss = 1.76758 (* 1 = 1.76758 loss)
I0729 00:51:35.275521 1327 solver.cpp:330] Iteration 20, Testing net (#0)
I0729 00:51:35.275537 1327 net.cpp:676] Ignoring source layer data
I0729 00:51:35.361196 1334 data_layer.cpp:73] Restarting data prefetching from start.
I0729 00:51:35.364228 1327 solver.cpp:397] Test net output #0: accuracy = 0.7281
I0729 00:51:35.364244 1327 solver.cpp:397] Test net output #1: loss = 0.926002 (* 1 = 0.926002 loss)
I0729 00:51:35.386831 1327 solver.cpp:330] Iteration 30, Testing net (#0)
I0729 00:51:35.386847 1327 net.cpp:676] Ignoring source layer data
I0729 00:51:35.468611 1334 data_layer.cpp:73] Restarting data prefetching from start.
I0729 00:51:35.471771 1327 solver.cpp:397] Test net output #0: accuracy = 0.7942
I0729 00:51:35.471804 1327 solver.cpp:397] Test net output #1: loss = 0.654145 (* 1 = 0.654145 loss)
Is anyone able to reproduce the error, or explain what is going on and how to fix it?
Thanks!