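I am trying to estimate depth from a single image. I have an input image, a ground truth (ground_truth = depth_image), and a very simple fully convolutional network. Both the image and the depth image have shape 1x227x227.
My train_val.prototxt looks like this: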
layer {
  name: "train-data"
  type: "Data"
  top: "data"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "mean_train.binaryproto"
  }
  data_param {
    source: ".../train_lmdb"
    batch_size: 4
    backend: LMDB
  }
}
layer {
  name: "train-depth"
  type: "Data"
  top: "depth"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "mean_train.binaryproto"
  }
  data_param {
    source: ".../train_depth_lmdb"
    batch_size: 4
    backend: LMDB
  }
}
layer {
  name: "val-data"
  type: "Data"
  top: "data"
  include {
    phase: TEST
  }
  transform_param {
    mean_file: "mean_val.binaryproto"
  }
  data_param {
    source: ".../val_lmdb"
    batch_size: 4
    backend: LMDB
  }
}
layer {
  name: "val-depth"
  type: "Data"
  top: "depth"
  include {
    phase: TEST
  }
  transform_param {
    mean_file: "mean_train.binaryproto"
  }
  data_param {
    source: ".../val_depth_lmdb"
    batch_size: 4
    backend: LMDB
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 16
    kernel_size: 17
    stride: 1
    pad: 8
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "conv1"
  top: "conv2"
  convolution_param {
    num_output: 16
    kernel_size: 15
    stride: 1
    pad: 7
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "conv2"
  top: "conv3"
  convolution_param {
    num_output: 32
    kernel_size: 11
    stride: 1
    pad: 5
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  convolution_param {
    num_output: 32
    kernel_size: 9
    stride: 1
    pad: 4
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  convolution_param {
    num_output: 32
    kernel_size: 9
    stride: 1
    pad: 4
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "conv6"
  type: "Convolution"
  bottom: "conv5"
  top: "conv6"
  convolution_param {
    num_output: 64
    kernel_size: 3
    stride: 1
    pad: 1
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "conv6"
  top: "conv6"
}
layer {
  name: "conv7"
  type: "Convolution"
  bottom: "conv6"
  top: "conv7"
  convolution_param {
    num_output: 1
    kernel_size: 1
    stride: 1
    pad: 0
  }
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "conv7"
  bottom: "depth"
  top: "loss"
}
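As a sanity check that conv7 and the depth label have the same spatial size, the standard convolution output-size formula can be applied to each layer. This is only a sketch: the kernel and pad values are copied from the prototxt above, while `conv_out`, `LAYERS`, and `trace_sizes` are hypothetical helper names, not part of Caffe.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution (integer division, as in Caffe)."""
    return (size + 2 * pad - kernel) // stride + 1


# (name, kernel_size, pad) copied from the prototxt; stride is 1 in every layer.
LAYERS = [
    ("conv1", 17, 8),
    ("conv2", 15, 7),
    ("conv3", 11, 5),
    ("conv4", 9, 4),
    ("conv5", 9, 4),
    ("conv6", 3, 1),
    ("conv7", 1, 0),
]


def trace_sizes(size=227):
    """Return [(layer name, output height/width)] for a square input."""
    sizes = []
    for name, kernel, pad in LAYERS:
        size = conv_out(size, kernel, stride=1, pad=pad)
        sizes.append((name, size))
    return sizes


if __name__ == "__main__":
    for name, size in trace_sizes():
        print(name, size)
```

Every layer uses pad = (kernel_size - 1) / 2 with stride 1, so the spatial size stays at 227 all the way to conv7, and the output shape matches the 1x227x227 depth label. A shape mismatch is therefore unlikely to be the cause.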
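Here is my log file. As you can see, the loss is extremely high (on the order of 1e8) and is no lower at iteration 3390 than at iteration 0. I don't know why. I have already tried changing the learning rate, but that does not seem to be the problem. Any ideas?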
I1101 12:26:48.211413 14410 solver.cpp:404] Test net output #0: loss = 1.75823e+08 (* 1 = 1.75823e+08 loss)
I1101 12:26:48.231112 14410 solver.cpp:228] Iteration 0, loss = 1.08477e+08
I1101 12:26:48.231155 14410 solver.cpp:244] Train net output #0: loss = 1.08477e+08 (* 1 = 1.08477e+08 loss)
I1101 12:26:48.231180 14410 sgd_solver.cpp:106] Iteration 0, lr = 0.001
I1101 12:26:48.275223 14410 solver.cpp:228] Iteration 1, loss = 1.34519e+08
I1101 12:26:48.275249 14410 solver.cpp:244] Train net output #0: loss = 1.34519e+08 (* 1 = 1.34519e+08 loss)
I1101 12:26:48.275254 14410 sgd_solver.cpp:106] Iteration 1, lr = 0.001
I1101 12:26:48.313233 14410 solver.cpp:228] Iteration 2, loss = 1.57773e+08
I1101 12:26:48.313277 14410 solver.cpp:244] Train net output #0: loss = 1.57773e+08 (* 1 = 1.57773e+08 loss)
I1101 12:26:48.313282 14410 sgd_solver.cpp:106] Iteration 2, lr = 0.001
I1101 12:26:48.349695 14410 solver.cpp:228] Iteration 3, loss = 1.12463e+08
I1101 12:26:48.349742 14410 solver.cpp:244] Train net output #0: loss = 1.12463e+08 (* 1 = 1.12463e+08 loss)
...
I1101 12:29:00.106989 14410 solver.cpp:317] Iteration 3390, loss = 1.21181e+08
I1101 12:29:00.107023 14410 solver.cpp:337] Iteration 3390, Testing net (#0)
I1101 12:29:00.107029 14410 net.cpp:693] Ignoring source layer train-data
I1101 12:29:00.107033 14410 net.cpp:693] Ignoring source layer train-depth
I1101 12:29:00.294692 14410 solver.cpp:404] Test net output #0: loss = 1.7288e+08 (* 1 = 1.7288e+08 loss)
I1101 12:29:00.294737 14410 solver.cpp:322] Optimization Done.
I1101 12:29:00.294741 14410 caffe.cpp:254] Optimization Done.
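To put the logged numbers on a per-pixel scale, here is a rough sketch. It assumes the layer follows Caffe's documented EuclideanLoss definition, E = 1/(2N) * sum over the batch of ||prediction - label||^2, which sums over all pixels of an image instead of averaging them. `per_pixel_rms` is a hypothetical helper, and the 1x227x227 shape is taken from the description at the top.

```python
import math


def per_pixel_rms(loss, channels=1, height=227, width=227):
    """Convert a Caffe EuclideanLoss value into an RMS error per pixel.

    Assumes loss = 1/(2N) * sum_n ||pred_n - label_n||^2, i.e. half the
    summed squared error per image, averaged over the batch.
    """
    mean_squared_error = 2.0 * loss / (channels * height * width)
    return math.sqrt(mean_squared_error)


if __name__ == "__main__":
    # Values taken from the log above.
    for tag, loss in [("train, iteration 3390", 1.21181e8),
                      ("test, iteration 3390", 1.7288e8)]:
        print(tag, round(per_pixel_rms(loss), 1))
```

Under that assumption, a loss of about 1.2e8 corresponds to an RMS error of roughly 70 per pixel, and 1.7e8 to roughly 80. If the blobs hold 8-bit intensities, that is a large error but not an astronomically large one, so much of the 1e8 magnitude comes from summing over 227 x 227 pixels. The fact that the loss does not go down is still unexplained.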
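Just for fun, I also set the label data to be exactly the same as the input data, so that the network only has to learn to reproduce the image itself, but the result is still the same. Something must be completely wrong?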