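I am trying to implement a residual layer for a CNN (using Caffe and Python). This is a simple block diagram of residual learning:
Here is the code I wrote: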
def res(self,bottom,args):
    'residual layer'
    rp = {'negative_slope': 0}
    if len(args)!=6:
        raise Exception('conv requires 6 arguments: ks, stride, pad, group, nout, bias')
    ks, stride, pad, group, nout, bias = [int(x) for x in args]
    wf = {}
    bias = bool(bias)
    cp = { 'kernel_size' : [1, ks],
           'stride' : [1, stride],
           'pad' : [0, pad],
           'group' : group,
           'num_output' : nout,
           'bias_term' : bias,
           'axis' : 1,
           'weight_filler' : { 'type': 'xavier' },
           'bias_filler' : { 'type': 'constant', 'value':0.0 },
         }
    # multipliers for learning rate and decay of weights and bias
    p = [{'lr_mult':1, 'decay_mult':1}]
    if bias:
        p.append({'lr_mult':2, 'decay_mult':0})
    myconv1 = L.Convolution(bottom, param=p, convolution_param=cp)
    rconv1 = L.ReLU(myconv1, relu_param=rp, in_place=True)
    cp2 = { 'kernel_size' : [1, ks],
            'stride' : [1, stride],
            'pad' : [0, pad+2],
            'group' : group,
            'num_output' : nout,
            'bias_term' : bias,
            'axis' : 1,
            'weight_filler' : { 'type': 'xavier' },
            'bias_filler' : { 'type': 'constant', 'value':0.0 },
          }
    myconv2 = L.Convolution(rconv1, param=p, convolution_param=cp2)
    forSum = []
    forSum.append(bottom)
    forSum.append(myconv2)
    ep = { 'operation' : 1 }
    return L.Eltwise(*forSum, eltwise_param=ep)
This is the error I get for the architecture c:3:1:0:1:16:0 cr mp:2:2 res:3:1:0:1:16:0 cr mp:2:2 fc:20:0:
python /afs/in2p3.fr/home/n/nhatami/sps/spectroML/src/python/makeSpectroNet.py -label label -n CNN_062 -bs 10 res/2048_1e5_0.00_s/CNN_062_bs10/CNN_062_tmp/CNN_062 data/2048_1e5_0.00/2048_1e5_0.00_s c:3:1:0:1:16:0 cr mp:2:2 res:3:1:0:1:16:0 cr mp:2:2 fc:20:0
Namespace(batchSize=10, droot='data/2048_1e5_0.00/2048_1e5_0.00_s', label='label', layers=['c:3:1:0:1:16:0', 'cr', 'mp:2:2', 'res:3:1:0:1:16:0', 'cr', 'mp:2:2', 'fc:20:0'], name='CNN_062', oroot='res/2048_1e5_0.00_s/CNN_062_bs10/CNN_062_tmp/CNN_062')
data/2048_1e5_0.00/2048_1e5_0.00_s data/2048_1e5_0.00/2048_1e5_0.00_s_train_list.txt data/2048_1e5_0.00/2048_1e5_0.00_s_val_list.txt
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0208 18:00:05.952062 194649 upgrade_proto.cpp:67] Attempting to upgrade input file specified using deprecated input fields: res/2048_1e5_0.00_s/CNN_062_bs10/CNN_062_tmp/CNN_062_deploy.txt
I0208 18:00:05.952121 194649 upgrade_proto.cpp:70] Successfully upgraded file specified using deprecated input fields.
W0208 18:00:05.952126 194649 upgrade_proto.cpp:72] Note that future Caffe releases will only support input layers and not input fields.
I0208 18:00:06.349092 194649 net.cpp:51] Initializing net from parameters:
name: "CNN_062"
state {
phase: TEST
level: 0
}
layer {
name: "input"
type: "Input"
top: "data"
input_param {
shape {
dim: 1
dim: 2
dim: 1
dim: 2048
}
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
param {
lr_mult: 1
decay_mult: 1
}
convolution_param {
num_output: 16
bias_term: false
pad: 0
pad: 0
kernel_size: 1
kernel_size: 3
group: 1
stride: 1
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
axis: 1
}
}
layer {
name: "Scale1"
type: "Scale"
bottom: "conv1"
top: "Scale1"
param {
lr_mult: 0
decay_mult: 0
}
scale_param {
filler {
type: "constant"
value: -1
}
}
}
layer {
name: "ReLU1"
type: "ReLU"
bottom: "Scale1"
top: "ReLU1"
relu_param {
negative_slope: 0
}
}
layer {
name: "Scale2"
type: "Scale"
bottom: "ReLU1"
top: "Scale2"
param {
lr_mult: 0
decay_mult: 0
}
scale_param {
filler {
type: "constant"
value: -1
}
}
}
layer {
name: "ReLU2"
type: "ReLU"
bottom: "conv1"
top: "ReLU2"
relu_param {
negative_slope: 0
}
}
layer {
name: "crelu1"
type: "Concat"
bottom: "Scale2"
bottom: "ReLU2"
top: "crelu1"
}
layer {
name: "maxPool1"
type: "Pooling"
bottom: "crelu1"
top: "maxPool1"
pooling_param {
pool: MAX
kernel_h: 1
kernel_w: 2
stride_h: 1
stride_w: 2
pad_h: 0
pad_w: 0
}
}
layer {
name: "Convolution1"
type: "Convolution"
bottom: "maxPool1"
top: "Convolution1"
param {
lr_mult: 1
decay_mult: 1
}
convolution_param {
num_output: 16
bias_term: false
pad: 0
pad: 0
kernel_size: 1
kernel_size: 3
group: 1
stride: 1
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
axis: 1
}
}
layer {
name: "ReLU3"
type: "ReLU"
bottom: "Convolution1"
top: "Convolution1"
relu_param {
negative_slope: 0
}
}
layer {
name: "Convolution2"
type: "Convolution"
bottom: "Convolution1"
top: "Convolution2"
param {
lr_mult: 1
decay_mult: 1
}
convolution_param {
num_output: 16
bias_term: false
pad: 0
pad: 2
kernel_size: 1
kernel_size: 3
group: 1
stride: 1
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
axis: 1
}
}
layer {
name: "res1"
type: "Eltwise"
bottom: "maxPool1"
bottom: "Convolution2"
top: "res1"
eltwise_param {
operation: SUM
}
}
layer {
name: "Scale3"
type: "Scale"
bottom: "res1"
top: "Scale3"
param {
lr_mult: 0
decay_mult: 0
}
scale_param {
filler {
type: "constant"
value: -1
}
}
}
layer {
name: "ReLU4"
type: "ReLU"
bottom: "Scale3"
top: "ReLU4"
relu_param {
negative_slope: 0
}
}
layer {
name: "Scale4"
type: "Scale"
bottom: "ReLU4"
top: "Scale4"
param {
lr_mult: 0
decay_mult: 0
}
scale_param {
filler {
type: "constant"
value: -1
}
}
}
layer {
name: "ReLU5"
type: "ReLU"
bottom: "res1"
top: "ReLU5"
relu_param {
negative_slope: 0
}
}
layer {
name: "crelu2"
type: "Concat"
bottom: "Scale4"
bottom: "ReLU5"
top: "crelu2"
}
layer {
name: "maxPool2"
type: "Pooling"
bottom: "crelu2"
top: "maxPool2"
pooling_param {
pool: MAX
kernel_h: 1
kernel_w: 2
stride_h: 1
stride_w: 2
pad_h: 0
pad_w: 0
}
}
layer {
name: "ampl"
type: "InnerProduct"
bottom: "maxPool2"
top: "ampl"
param {
lr_mult: 1
decay_mult: 1
}
inner_product_param {
num_output: 20
bias_term: false
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.2
}
}
}
I0208 18:00:06.349267 194649 layer_factory.hpp:77] Creating layer input
I0208 18:00:06.349287 194649 net.cpp:84] Creating Layer input
I0208 18:00:06.349298 194649 net.cpp:380] input -> data
I0208 18:00:06.349334 194649 net.cpp:122] Setting up input
I0208 18:00:06.349346 194649 net.cpp:129] Top shape: 1 2 1 2048 (4096)
I0208 18:00:06.349351 194649 net.cpp:137] Memory required for data: 16384
I0208 18:00:06.349356 194649 layer_factory.hpp:77] Creating layer conv1
I0208 18:00:06.349371 194649 net.cpp:84] Creating Layer conv1
I0208 18:00:06.349376 194649 net.cpp:406] conv1 <- data
[...]
I0208 18:00:06.349556 194649 net.cpp:380] conv1_conv1_0_split -> conv1_conv1_0_split_1
I0208 18:00:06.349568 194649 net.cpp:122] Setting up conv1_conv1_0_split
I0208 18:00:06.349575 194649 net.cpp:129] Top shape: 1 16 1 2046 (32736)
I0208 18:00:06.349580 194649 net.cpp:129] Top shape: 1 16 1 2046 (32736)
I0208 18:00:06.349583 194649 net.cpp:137] Memory required for data: 409216
I0208 18:00:06.349587 194649 layer_factory.hpp:77] Creating layer Scale1
I0208 18:00:06.349598 194649 net.cpp:84] Creating Layer Scale1
I0208 18:00:06.349603 194649 net.cpp:406] Scale1 <- conv1_conv1_0_split_0
I0208 18:00:06.349611 194649 net.cpp:380] Scale1 -> Scale1
I0208 18:00:06.349642 194649 net.cpp:122] Setting up Scale1
I0208 18:00:06.349647 194649 net.cpp:129] Top shape: 1 16 1 2046 (32736)
I0208 18:00:06.349651 194649 net.cpp:137] Memory required for data: 540160
I0208 18:00:06.349659 194649 layer_factory.hpp:77] Creating layer ReLU1
I0208 18:00:06.349668 194649 net.cpp:84] Creating Layer ReLU1
I0208 18:00:06.349673 194649 net.cpp:406] ReLU1 <- Scale1
I0208 18:00:06.349679 194649 net.cpp:380] ReLU1 -> ReLU1
I0208 18:00:06.349689 194649 net.cpp:122] Setting up ReLU1
I0208 18:00:06.349694 194649 net.cpp:129] Top shape: 1 16 1 2046 (32736)
I0208 18:00:06.349699 194649 net.cpp:137] Memory required for data: 671104
I0208 18:00:06.349702 194649 layer_factory.hpp:77] Creating layer Scale2
I0208 18:00:06.349709 194649 net.cpp:84] Creating Layer Scale2
I0208 18:00:06.349714 194649 net.cpp:406] Scale2 <- ReLU1
I0208 18:00:06.349720 194649 net.cpp:380] Scale2 -> Scale2
I0208 18:00:06.349741 194649 net.cpp:122] Setting up Scale2
I0208 18:00:06.349747 194649 net.cpp:129] Top shape: 1 16 1 2046 (32736)
I0208 18:00:06.349751 194649 net.cpp:137] Memory required for data: 802048
I0208 18:00:06.349758 194649 layer_factory.hpp:77] Creating layer ReLU2
I0208 18:00:06.349771 194649 net.cpp:84] Creating Layer ReLU2
I0208 18:00:06.349776 194649 net.cpp:406] ReLU2 <- conv1_conv1_0_split_1
I0208 18:00:06.349782 194649 net.cpp:380] ReLU2 -> ReLU2
I0208 18:00:06.349789 194649 net.cpp:122] Setting up ReLU2
I0208 18:00:06.349795 194649 net.cpp:129] Top shape: 1 16 1 2046 (32736)
I0208 18:00:06.349799 194649 net.cpp:137] Memory required for data: 932992
I0208 18:00:06.349803 194649 layer_factory.hpp:77] Creating layer crelu1
I0208 18:00:06.349812 194649 net.cpp:84] Creating Layer crelu1
I0208 18:00:06.349815 194649 net.cpp:406] crelu1 <- Scale2
I0208 18:00:06.349822 194649 net.cpp:406] crelu1 <- ReLU2
I0208 18:00:06.349829 194649 net.cpp:380] crelu1 -> crelu1
I0208 18:00:06.349843 194649 net.cpp:122] Setting up crelu1
I0208 18:00:06.349848 194649 net.cpp:129] Top shape: 1 32 1 2046 (65472)
I0208 18:00:06.349853 194649 net.cpp:137] Memory required for data: 1194880
I0208 18:00:06.349856 194649 layer_factory.hpp:77] Creating layer maxPool1
I0208 18:00:06.349864 194649 net.cpp:84] Creating Layer maxPool1
I0208 18:00:06.349870 194649 net.cpp:406] maxPool1 <- crelu1
I0208 18:00:06.349876 194649 net.cpp:380] maxPool1 -> maxPool1
I0208 18:00:06.349891 194649 net.cpp:122] Setting up maxPool1
I0208 18:00:06.349897 194649 net.cpp:129] Top shape: 1 32 1 1023 (32736)
I0208 18:00:06.349901 194649 net.cpp:137] Memory required for data: 1325824
I0208 18:00:06.349905 194649 layer_factory.hpp:77] Creating layer maxPool1_maxPool1_0_split
I0208 18:00:06.349911 194649 net.cpp:84] Creating Layer maxPool1_maxPool1_0_split
I0208 18:00:06.349915 194649 net.cpp:406] maxPool1_maxPool1_0_split <- maxPool1
I0208 18:00:06.349925 194649 net.cpp:380] maxPool1_maxPool1_0_split -> maxPool1_maxPool1_0_split_0
I0208 18:00:06.349931 194649 net.cpp:380] maxPool1_maxPool1_0_split -> maxPool1_maxPool1_0_split_1
I0208 18:00:06.349937 194649 net.cpp:122] Setting up maxPool1_maxPool1_0_split
I0208 18:00:06.349943 194649 net.cpp:129] Top shape: 1 32 1 1023 (32736)
I0208 18:00:06.349948 194649 net.cpp:129] Top shape: 1 32 1 1023 (32736)
I0208 18:00:06.349952 194649 net.cpp:137] Memory required for data: 1587712
I0208 18:00:06.349962 194649 layer_factory.hpp:77] Creating layer Convolution1
I0208 18:00:06.349973 194649 net.cpp:84] Creating Layer Convolution1
I0208 18:00:06.349983 194649 net.cpp:406] Convolution1 <- maxPool1_maxPool1_0_split_0
I0208 18:00:06.349999 194649 net.cpp:380] Convolution1 -> Convolution1
I0208 18:00:06.350034 194649 net.cpp:122] Setting up Convolution1
I0208 18:00:06.350040 194649 net.cpp:129] Top shape: 1 16 1 1021 (16336)
I0208 18:00:06.350044 194649 net.cpp:137] Memory required for data: 1653056
I0208 18:00:06.350050 194649 layer_factory.hpp:77] Creating layer ReLU3
I0208 18:00:06.350056 194649 net.cpp:84] Creating Layer ReLU3
I0208 18:00:06.350061 194649 net.cpp:406] ReLU3 <- Convolution1
I0208 18:00:06.350067 194649 net.cpp:367] ReLU3 -> Convolution1 (in-place)
I0208 18:00:06.350075 194649 net.cpp:122] Setting up ReLU3
I0208 18:00:06.350080 194649 net.cpp:129] Top shape: 1 16 1 1021 (16336)
I0208 18:00:06.350083 194649 net.cpp:137] Memory required for data: 1718400
I0208 18:00:06.350087 194649 layer_factory.hpp:77] Creating layer Convolution2
I0208 18:00:06.350095 194649 net.cpp:84] Creating Layer Convolution2
I0208 18:00:06.350100 194649 net.cpp:406] Convolution2 <- Convolution1
I0208 18:00:06.350108 194649 net.cpp:380] Convolution2 -> Convolution2
I0208 18:00:06.350132 194649 net.cpp:122] Setting up Convolution2
I0208 18:00:06.350138 194649 net.cpp:129] Top shape: 1 16 1 1023 (16368)
I0208 18:00:06.350142 194649 net.cpp:137] Memory required for data: 1783872
I0208 18:00:06.350149 194649 layer_factory.hpp:77] Creating layer res1
I0208 18:00:06.350158 194649 net.cpp:84] Creating Layer res1
I0208 18:00:06.350163 194649 net.cpp:406] res1 <- maxPool1_maxPool1_0_split_1
I0208 18:00:06.350168 194649 net.cpp:406] res1 <- Convolution2
I0208 18:00:06.350178 194649 net.cpp:380] res1 -> res1
F0208 18:00:06.350195 194649 eltwise_layer.cpp:34] Check failed: bottom[0]->shape() == bottom[i]->shape() bottom[0]: 1 32 1 1023 (32736), bottom[1]: 1 16 1 1023 (16368)
*** Check failure stack trace: ***
I would really appreciate your help!
Answer 0 (score: 3)
The tricky thing about residual blocks is that x and F(x) must have the same shape, otherwise you cannot sum them: x + F(x).
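In your example, it seems that x has 32 channels while F(x) has only 16: the failed Eltwise check reports bottom[0] as 1 32 1 1023 and bottom[1] as 1 16 1 1023.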
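To see where the two shapes come from, the layer parameters in your log can be traced by hand. The following is only a rough sketch in plain Python (no Caffe needed); the helper names `conv_out` and `pool_out` are made up for illustration, and the formulas are the usual Caffe output-size rules (convolution rounds down, pooling rounds up).

```python
import math

def conv_out(width, kernel, stride=1, pad=0):
    # Caffe convolution output size: floor((W + 2*pad - kernel) / stride) + 1
    return (width + 2 * pad - kernel) // stride + 1

def pool_out(width, kernel, stride, pad=0):
    # Caffe pooling rounds up instead of down
    return int(math.ceil((width + 2 * pad - kernel) / float(stride))) + 1

# Shortcut path x: data -> conv1 -> crelu1 -> maxPool1
width = conv_out(2048, kernel=3)              # conv1 (16 outputs), width 2046
channels = 2 * 16                             # crelu1 concatenates two 16-channel blobs
width = pool_out(width, kernel=2, stride=2)   # maxPool1, width 1023
x_shape = (channels, width)

# Residual branch F(x): Convolution1 (pad 0) -> ReLU3 -> Convolution2 (pad 2)
f_width = conv_out(width, kernel=3, pad=0)    # width 1021
f_width = conv_out(f_width, kernel=3, pad=2)  # width 1023
fx_shape = (16, f_width)                      # num_output is 16

print(x_shape, fx_shape)
```

The widths agree (1023), so your pad+2 trick works for the spatial dimension. The mismatch is only in the channel count: the "cr" layer before the residual block doubles the channels to 32, while both convolutions in the block output 16.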
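When the dimension of F(x) differs from that of x, it is common to put a 1x1 convolution layer on the residual link:
- when stride != 1 (the spatial dimensions differ)
- when changing the number of channels (usually at the start of a new "block" in a ResNet)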
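As a minimal sketch of how this could look with the same pycaffe calls you already use: the helpers `shortcut_conv_param` and `residual_sum` below are hypothetical names, and `in_channels` is something your builder would have to track itself (32 after the CReLU in your case). The sketch also assumes that the F(x) branch ends up with the same width as the projected shortcut.

```python
def shortcut_conv_param(in_channels, out_channels, stride=1):
    """Return a convolution_param dict for a 1x1 projection on the shortcut,
    or None when x can be added to F(x) unchanged (identity shortcut)."""
    if in_channels == out_channels and stride == 1:
        return None
    return {
        'kernel_size': [1, 1],       # 1x1: only mixes channels
        'stride': [1, stride],       # same stride as the residual branch
        'pad': [0, 0],
        'num_output': out_channels,  # match the channel count of F(x)
        'bias_term': False,
        'weight_filler': {'type': 'xavier'},
    }

def residual_sum(bottom, branch, in_channels, out_channels, stride=1):
    """Sum the shortcut (projected if necessary) with the branch output F(x)."""
    from caffe import layers as L  # imported lazily; requires pycaffe
    cp = shortcut_conv_param(in_channels, out_channels, stride)
    shortcut = bottom if cp is None else L.Convolution(bottom, convolution_param=cp)
    return L.Eltwise(shortcut, branch, eltwise_param={'operation': 1})  # 1 = SUM
```

In your res function, this would replace the final Eltwise: something like `return residual_sum(bottom, myconv2, 32, nout, stride)` instead of summing `bottom` directly. An alternative that avoids the projection is to make the block output as many channels as it receives (nout equal to 32 here).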