Here is the last layer of my training net:
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "final"
  bottom: "label"
  top: "loss"
  loss_param {
    ignore_label: 255
    normalization: VALID
  }
}
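Note that I use a SoftmaxWithLoss layer. Since it computes the loss as -log(probability), it is strange that the loss can be negative, as shown below (iteration 80).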
I0404 23:32:49.400624 6903 solver.cpp:228] Iteration 79, loss = 0.167006
I0404 23:32:49.400806 6903 solver.cpp:244] Train net output #0: loss = 0.167008 (* 1 = 0.167008 loss)
I0404 23:32:49.400825 6903 sgd_solver.cpp:106] Iteration 79, lr = 0.0001
I0404 23:33:25.660655 6903 solver.cpp:228] Iteration 80, loss = -1.54972e-06
I0404 23:33:25.660845 6903 solver.cpp:244] Train net output #0: loss = 0 (* 1 = 0 loss)
I0404 23:33:25.660862 6903 sgd_solver.cpp:106] Iteration 80, lr = 0.0001
I0404 23:34:00.451464 6903 solver.cpp:228] Iteration 81, loss = 1.89034
I0404 23:34:00.451661 6903 solver.cpp:244] Train net output #0: loss = 1.89034 (* 1 = 1.89034 loss)
Can anyone explain this to me? How can this happen? Thank you very much!
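PS: The task here is semantic segmentation. There are 20 object classes plus background (so 21 classes). The labels range from 0-21. The extra label 255 is ignored, as specified in the SoftmaxWithLoss definition at the top of this post.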
Answer 0 (score: 0)
Are you running on the GPU or the CPU? Print out the prob_data you get after the softmax operation:
// Locate this line in the CPU or GPU Forward function of the loss layer
softmax_layer_->Forward(softmax_bottom_vec_, softmax_top_vec_);
// cpu_data() makes sure the probabilities are available on the host
const Dtype* prob_data = prob_.cpu_data();
for (int i = 0; i < prob_.count(); i++) {
  printf("%f ", prob_data[i]);
}
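To cross-check the printed probabilities against the reported loss, a small standalone sketch of the computation may help. It assumes the loss is the mean of -log(p) over the positions whose label is not ignore_label, which is how I read normalization: VALID. SoftmaxLossValid and its flat [position][class] layout are hypothetical and are not Caffe's API or its blob layout, and returning 0 when every position is ignored is a choice made for this sketch only.

```cpp
#include <algorithm>
#include <cassert>
#include <cfloat>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper, not part of Caffe.
// Averages -log(p[label]) over all positions whose label is not ignore_label.
// probs is laid out as [position][class]: probs[i * num_classes + c].
double SoftmaxLossValid(const std::vector<float>& probs,
                        const std::vector<int>& labels,
                        int num_classes, int ignore_label) {
  assert(probs.size() == labels.size() * static_cast<std::size_t>(num_classes));
  double loss = 0.0;
  int valid = 0;
  for (std::size_t i = 0; i < labels.size(); ++i) {
    const int label = labels[i];
    if (label == ignore_label) continue;  // ignored positions add nothing
    if (label < 0 || label >= num_classes) {
      // A label outside [0, num_classes) has no matching probability.
      std::printf("position %zu: label %d is outside [0, %d)\n",
                  i, label, num_classes);
      continue;
    }
    // Clamp away from zero so log() stays finite.
    const float p = std::max(probs[i * num_classes + label], FLT_MIN);
    loss -= std::log(p);
    ++valid;
  }
  // Assumption of this sketch: report 0 when no position is valid.
  return valid > 0 ? loss / valid : 0.0;
}
```

Since every probability is at most 1, each term -log(p) is non-negative, so a clearly negative result from this function would point at the probabilities or labels rather than the formula. With 21 classes, any label other than 0 to 20 or 255 would be reported by the range check.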