I have trained an image deblurring network. As my loss function I use the Euclidean loss between the reconstructed image and the ground-truth image. For images of size 249x249x3 (the size of both the ground truth and the reconstruction), I get a Euclidean loss of about 3.0.
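For reference, my understanding of the number Caffe reports (a quick sketch, not code I actually run): the EuclideanLoss layer computes 1/(2N) * sum((y_hat - y)^2) over every element of the blob, divided only by the batch size N (1 here), not by the pixel count.

import numpy as np

def euclidean_loss(pred, label, batch_size=1):
    # Same formula as Caffe's EuclideanLoss layer: sum of squared
    # differences over every element, divided by 2 * batch_size.
    diff = np.asarray(pred, dtype=np.float64) - np.asarray(label, dtype=np.float64)
    return (diff ** 2).sum() / (2.0 * batch_size)

# A loss of ~3.0 on a 3x249x249 blob is therefore an average squared
# per-element error of 2 * 3.0 / (3 * 249 * 249) ~= 3.2e-5 in the scaled
# (x 0.004) input range.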
The blurred inputs were generated by applying a Gaussian blur with an 11x11 kernel to the sharp images. I used about 15k training images. My final convolution layer has num_output: 3, so it produces three feature maps, and I want to build the RGB output image from the outputs of these three filters.
I am unable to construct an image from the output of the final convolution layer.
So far I have the following code for reconstructing the final image, but it does not produce a good image.
import scipy.misc

# 'net' is the caffe.Net that was loaded and run with net.forward() earlier
inputBlob = net.blobs.keys()[0]    # name of the first (input) blob
outputBlob = net.blobs.keys()[-1]  # name of the last (output) blob
print inputBlob
print outputBlob
out = net.blobs[outputBlob].data   # shape (1, 3, H, W)
print out.shape
# print type(out)
# print out
# drop the batch dimension: (3, H, W)
out = out.reshape(out.shape[1], out.shape[2], out.shape[3])
print out.shape
# channels-first to channels-last: (H, W, 3)
out = out.transpose(1, 2, 0)
print out.shape
# out /= 0.004
# out[:,:,0] += 103.939
# out[:,:,1] += 116.779
# out[:,:,2] += 123.68
# print out
# print type(out)
# print out.shape
scipy.misc.imsave('out.jpg', out)
The commented-out lines are debugging steps. I have also tried de-normalizing the image and adding the mean back, but that did not give a good result either. I tried this step because I do mean subtraction and scaling before forwarding the input through the network.
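For what it's worth, here is a minimal sketch of how I believe that inversion should look, reusing outputBlob and net from the snippet above. The label_mean.binaryproto path is the one from the prototxt below, the top-left crop of the mean is only an approximation (the training crop was random), and the BGR-to-RGB flip assumes the LMDBs were built with OpenCV:

import numpy as np
import caffe
import scipy.misc
from caffe.proto import caffe_pb2

# Load the per-pixel mean that was subtracted from the labels during training.
blob = caffe_pb2.BlobProto()
with open('/home/gpu/Programs/Dharma/DeblurrNet/data/label_mean.binaryproto', 'rb') as f:
    blob.ParseFromString(f.read())
mean_arr = caffe.io.blobproto_to_array(blob)[0]        # (3, H_mean, W_mean)

raw = net.blobs[outputBlob].data[0]                    # (3, 249, 249), normalized units
# Caffe applied (pixel - mean) * 0.004, so invert it: raw / 0.004 + mean.
img = raw / 0.004 + mean_arr[:, :raw.shape[1], :raw.shape[2]]
img = np.clip(img, 0, 255).transpose(1, 2, 0).astype(np.uint8)
img = img[:, :, ::-1]                                  # BGR -> RGB before saving
scipy.misc.imsave('out_denorm.jpg', img)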
Any links, help, or suggestions would be appreciated. Below is my train_val.prototxt file:
name: "DeblurrNet"
layer {
name: "data"
type: "Data"
top: "data"
data_param {
source: "/home/gpu/Programs/Dharma/DeblurrNet/codes/train_lmdb"
batch_size: 1
backend: LMDB
}
transform_param {
mean_file: "/home/gpu/Programs/Dharma/DeblurrNet/data/mean.binaryproto"
scale: 0.004
crop_size: 255
# fixed_crop: true
}
include: { phase: TRAIN }
}
layer {
name: "labels"
type: "Data"
top: "labels"
data_param {
source: "/home/gpu/Programs/Dharma/DeblurrNet/codes/train_label_lmdb"
batch_size: 1
backend: LMDB
}
transform_param {
mean_file: "/home/gpu/Programs/Dharma/DeblurrNet/data/label_mean.binaryproto"
scale: 0.004
crop_size: 249
# fixed_crop: true
}
include: { phase: TRAIN }
}
layer {
name: "data"
type: "Data"
top: "data"
data_param {
source: "/home/gpu/Programs/Dharma/DeblurrNet/codes/val_lmdb"
backend: LMDB
batch_size: 1
}
transform_param {
mean_file: "/home/gpu/Programs/Dharma/DeblurrNet/data/mean.binaryproto"
scale: 0.004
crop_size: 255
# fixed_crop: true
}
include: { phase: TEST }
}
layer {
name: "labels"
type: "Data"
top: "labels"
data_param {
source: "/home/gpu/Programs/Dharma/DeblurrNet/codes/val_label_lmdb"
backend: LMDB
batch_size: 1
}
transform_param {
mean_file: "/home/gpu/Programs/Dharma/DeblurrNet/data/label_mean.binaryproto"
scale: 0.004
crop_size: 249
# fixed_crop: true
}
include: { phase: TEST }
}
layer {
name: "CONVX_1"
type: "Convolution"
bottom: "data"
top: "CONVX_1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
kernel_size: 5
num_output: 128
stride: 1
pad: 0
# group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "RELU_1"
type: "ReLU"
bottom: "CONVX_1"
top: "CONVX_1"
}
layer {
name: "CONVX_2"
type: "Convolution"
bottom: "CONVX_1"
top: "CONVX_2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
kernel_size: 1
num_output: 128
stride: 1
pad: 0
# group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "RELU_2"
type: "ReLU"
bottom: "CONVX_2"
top: "CONVX_2"
}
layer {
name: "CONVX_3"
type: "Convolution"
bottom: "CONVX_2"
top: "CONVX_3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
kernel_size: 1
num_output: 128
stride: 1
pad: 0
# group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "RELU_3"
type: "ReLU"
bottom: "CONVX_3"
top: "CONVX_3"
}
layer {
name: "CONVX_4"
type: "Convolution"
bottom: "CONVX_3"
top: "CONVX_4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
kernel_size: 1
num_output: 128
stride: 1
pad: 0
# group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "RELU_4"
type: "ReLU"
bottom: "CONVX_4"
top: "CONVX_4"
}
layer {
name: "CONVX_5"
type: "Convolution"
bottom: "CONVX_4"
top: "CONVX_5"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
kernel_size: 1
num_output: 128
stride: 1
pad: 0
# group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "RELU_5"
type: "ReLU"
bottom: "CONVX_5"
top: "CONVX_5"
}
layer {
name: "CONVX_6"
type: "Convolution"
bottom: "CONVX_5"
top: "CONVX_6"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
kernel_size: 3
num_output: 64
stride: 1
pad: 0
# group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "RELU_6"
type: "ReLU"
bottom: "CONVX_6"
top: "CONVX_6"
}
layer {
name: "CONVX_7"
type: "Convolution"
bottom: "CONVX_6"
top: "CONVX_7"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
kernel_size: 1
num_output: 16
stride: 1
pad: 0
# group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "RELU_7"
type: "ReLU"
bottom: "CONVX_7"
top: "CONVX_7"
}
layer {
name: "CONVX_8"
type: "Convolution"
bottom: "CONVX_7"
top: "CONVX_8"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
kernel_size: 1
num_output: 3
stride: 1
pad: 0
# group: 4
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss"
type: "EuclideanLoss"
top: "loss"
bottom: "CONVX_8"
bottom: "labels"
}
For creating the LMDBs I used the create_lmdbs.sh script from the ImageNet example in Caffe. I turned shuffling off, because the inputs and labels must stay paired and cannot be shuffled independently.
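As a sanity check on that pairing, an alternative sketch would be to write both LMDBs from the same sorted file list in Python, so that entry i of train_lmdb and entry i of train_label_lmdb always come from the same image pair (the directory names blurred/ and sharp/ below are placeholders):

import os
import lmdb
import cv2
from caffe.io import array_to_datum

def write_lmdb(db_path, image_dir, file_names):
    # Write images in the exact order of file_names so that the data LMDB
    # and the label LMDB stay paired entry by entry.
    env = lmdb.open(db_path, map_size=int(1e12))
    with env.begin(write=True) as txn:
        for i, name in enumerate(file_names):
            img = cv2.imread(os.path.join(image_dir, name))  # HxWxC, BGR, uint8
            datum = array_to_datum(img.transpose(2, 0, 1))   # expects CxHxW
            txn.put('{:08d}'.format(i), datum.SerializeToString())
    env.close()

names = sorted(os.listdir('blurred/'))
write_lmdb('train_lmdb', 'blurred/', names)
write_lmdb('train_label_lmdb', 'sharp/', names)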
Here the first image is the original, the second is the blurred version, and the third is the reconstruction produced by the CNN.
Answer (score: 1)
Your input layers perform cropping:
transform_param {
mean_file: "/home/gpu/Programs/Dharma/DeblurrNet/data/mean.binaryproto"
scale: 0.004
crop_size: 255 # <---- crop
}
If you read the comment in caffe.proto carefully, it says:
Specify if we would like to randomly crop an image.
Now here is what happens in your net: on one side you have an input image that is randomly cropped to 255x255, and on the other side a "clean" (ground-truth) image that is randomly cropped to 249x249. There is no correlation whatsoever between these two crops! You may well end up with the top-left part of the noisy input image paired with the bottom-right part of the noise-free ground-truth image. What do you expect your net to learn under these settings?
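A simple way around this, as a sketch with placeholder directory names: crop each blurred/sharp pair offline, at the same location, and remove crop_size from both data layers so the Data layers never crop at all. Since the unpadded 5x5 and 3x3 convolutions trim 255x255 down to 249x249, the label should be the central 249x249 of the same 255x255 window used for the input:

import os
import cv2

def center_crop(img, size):
    # Central size x size window of an HxWxC image.
    h, w = img.shape[:2]
    y, x = (h - size) // 2, (w - size) // 2
    return img[y:y + size, x:x + size]

src_blur, src_sharp = 'blurred/', 'sharp/'
dst_blur, dst_sharp = 'blurred_255/', 'sharp_249/'
for d in (dst_blur, dst_sharp):
    if not os.path.isdir(d):
        os.makedirs(d)

for name in sorted(os.listdir(src_blur)):
    blur = cv2.imread(os.path.join(src_blur, name))
    sharp = cv2.imread(os.path.join(src_sharp, name))
    # Both crops are centered on the same point, so the 249x249 label is the
    # region the network can still predict after the 6 border pixels are lost
    # to the unpadded 5x5 and 3x3 convolutions.
    cv2.imwrite(os.path.join(dst_blur, name), center_crop(blur, 255))
    cv2.imwrite(os.path.join(dst_sharp, name), center_crop(sharp, 249))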