Question

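I am trying to migrate a Caffe network and model (weights) to TensorFlow. The original first layer, whose definition is shown at the end, is a stride-1 convolution on 1x128x128 grayscale images with a 5x5 kernel and 96 output channels.

Using the following procedure, I converted the weights from the caffemodel file to NumPy arrays: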

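import caffe

# model: path to the network definition; weights: path to the .caffemodel file
net = caffe.Net(model, caffe.TEST)
net.copy_from(weights)
weights = net.params[name][0].data  # from here on, `weights` is the layer's weight array
bias = net.params[name][1].data
if "fc" in name:
    weights = weights.transpose()  # 2D
elif "conv" in name:
    # Caffe (out, in, h, w) -> TensorFlow (h, w, in, out)
    weights = weights.transpose(2, 3, 1, 0)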

The Caffe weights have shape (96, 1, 5, 5) and the biases have shape (96,). After the transpose, the new arrays, with weights shape (5, 5, 1, 96) and biases shape (96,), are used to initialize the TensorFlow filter.

The TensorFlow code is as follows:
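To make the layout change concrete, here is a minimal NumPy sketch of the same transpose. It uses random values as a stand-in for net.params[name][0].data (an assumption, not the real weights), and layouts_match is a hypothetical helper that checks each 2D kernel is unchanged after the transpose.

```python
import numpy as np

# Stand-in for net.params["conv1"][0].data; random values, not the real weights.
rng = np.random.default_rng(0)
w_caffe = rng.standard_normal((96, 1, 5, 5)).astype(np.float32)

# Caffe stores (out_channels, in_channels, height, width);
# tf.nn.conv2d expects (height, width, in_channels, out_channels).
w_tf = w_caffe.transpose(2, 3, 1, 0)


def layouts_match(w_caffe, w_tf):
    """Return True if every 2D kernel is identical in both layouts."""
    out_ch, in_ch = w_caffe.shape[:2]
    return all(
        np.array_equal(w_caffe[o, i], w_tf[:, :, i, o])
        for o in range(out_ch)
        for i in range(in_ch)
    )


print(w_tf.shape)                    # (5, 5, 1, 96)
print(layouts_match(w_caffe, w_tf))  # True
```

If the check passes on the real arrays, the kernel rows and columns have not been swapped by the conversion step.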

gray = tf.reduce_mean(images, axis=3, keep_dims=True)
self.gray = gray
conv1 = self._conv_layer(gray, name='conv1')

def _conv_layer(self, input_, output_dim=96,
                k_h=3, k_w=3, d_h=1, d_w=1, stddev=0.02,
                name="conv2d"):
    # Note: the kernel size and the input/output channel counts are currently
    # determined by the loaded filter weights; only the strides come from the
    # call arguments.
    with tf.variable_scope(name) as scope:
        filt = self.get_conv_filter(name)
        conv = tf.nn.conv2d(input_, filt, strides=[1, d_h, d_w, 1], padding='SAME')
        conv_biases = self.get_bias(name)
        return tf.nn.bias_add(conv, conv_biases)

def get_conv_filter(self, name):
    # `weights` is the transposed NumPy array loaded from the caffemodel
    init = tf.constant_initializer(value=weights,
                                   dtype=tf.float32)
    shape = weights.shape
    var = tf.get_variable(name="filter", initializer=init, shape=shape)
    return var

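I checked the input data of the Caffe net and of the TensorFlow tensor gray; they contain identical numbers in the same 2D layout. The shapes are (1, 1, 128, 128) and (10, 128, 128, 1); TensorFlow uses a batch size of 10.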
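A check like this can be scripted rather than done by eye. The sketch below is only an illustration: same_image is a hypothetical helper, and the arrays are random stand-ins for the Caffe input blob (assumed here to be net.blobs['data'].data) and for the evaluated gray tensor.

```python
import numpy as np


def same_image(caffe_blob, tf_batch, index=0):
    """Compare one Caffe NCHW image with one image of a TensorFlow NHWC batch."""
    caffe_nhwc = np.transpose(caffe_blob, (0, 2, 3, 1))  # NCHW -> NHWC
    return np.allclose(caffe_nhwc[0], tf_batch[index])


# Hypothetical stand-ins for the two inputs; random values, not the real images.
rng = np.random.default_rng(0)
tf_batch = rng.random((10, 128, 128, 1)).astype(np.float32)
caffe_blob = np.transpose(tf_batch[:1], (0, 3, 1, 2))  # shape (1, 1, 128, 128)

print(same_image(caffe_blob, tf_batch))  # True
```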

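I also checked the kernels, in Caffe with print(net.blobs['conv1'].data[0,0,...]) and in the NumPy array / TensorFlow variable with print(weights[:,:,:,0]).

A screenshot of the first kernel is shown below:

The bias is -0.65039569, and the top-left corner of the image is:

0.30989584 0.30989584 0.29427084 0.21354167 0.16145833
0.30989584 0.30989584 0.29427084 0.21354167 0.16145833
0.28645834 0.28645834 0.27083334 0.19010417 0.09114584

However, the top-left corner of the first feature map of Caffe's conv1 is different. (Please ignore the unrelated 256.)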

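Only the left-most column is consistent. I calculated the result by hand to check: the first and second values in -0.71238005 -0.74042225 (Caffe) are correct according to the definition of convolution, whereas the second value in -0.71238005 -0.31195271 (TensorFlow) is incorrect.

Taking the padding into account, the first value comes from a 3x3 block of the image, and the second value should come from a 3x4 block.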
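The hand calculation can be written as a small NumPy reference. This is a sketch under stated assumptions, not the code actually used for the check: it assumes a single channel, stride 1, an odd kernel size, and zero padding of half the kernel size on each side. conv2d_same and patch_shape are hypothetical helper names.

```python
import numpy as np


def conv2d_same(image, kernel):
    """Stride-1 cross-correlation with zero padding (no kernel flip)."""
    k_h, k_w = kernel.shape
    pad_h, pad_w = k_h // 2, k_w // 2  # 2 and 2 for a 5x5 kernel
    padded = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w)), mode="constant")
    out = np.empty(image.shape, dtype=np.float64)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            out[y, x] = np.sum(padded[y:y + k_h, x:x + k_w] * kernel)
    return out


def patch_shape(y, x, image_shape, k=5, pad=2):
    """Shape of the image region (padding excluded) under the kernel at (y, x)."""
    rows = min(y + k - pad, image_shape[0]) - max(y - pad, 0)
    cols = min(x + k - pad, image_shape[1]) - max(x - pad, 0)
    return rows, cols


print(patch_shape(0, 0, (128, 128)))  # (3, 3)
print(patch_shape(0, 1, (128, 128)))  # (3, 4)
```

Applied to one 2D image and one 5x5 kernel slice, with the matching bias added afterwards, the first row of the result can be compared directly against both frameworks' outputs.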

Since TensorFlow gets the first value right, which is computed from the 3x3 block at the image corner, I assume that the kernel layout, the image layout, and the 'SAME' padding are all correct. I suspected that the stride was causing the incorrect second value, but the stride has to be 1; otherwise the conv1 feature map would not have dimensions (10, 128, 128, 96).

Caffe's convolution layer definition:

input_param { shape: { dim: 10 dim: 1 dim: 128 dim: 128 } }
transform_param {
  crop_size: 128
  mirror: false
}
}
layer{
  name: "conv1"
  type: "Convolution"
  param { lr_mult: 1 decay_mult: 1 }
  param { lr_mult: 2 decay_mult: 0 }
  convolution_param {
    num_output: 96
    kernel_size: 5
    stride: 1
    pad: 2
    weight_filler { type: "xavier" }
    bias_filler { type: "constant" value: 0.1 }
  }
  bottom: "data"
  top: "conv1"
}
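The stride argument can be checked with the standard output-size formulas. The sketch below uses the usual formulas for a Caffe convolution and for 'SAME' padding in TensorFlow; the function names are hypothetical.

```python
import math


def caffe_output_size(size, kernel, pad, stride):
    """Caffe convolution: floor((size + 2*pad - kernel) / stride) + 1."""
    return (size + 2 * pad - kernel) // stride + 1


def tf_same_output_size(size, stride):
    """tf.nn.conv2d with padding='SAME': ceil(size / stride)."""
    return math.ceil(size / stride)


print(caffe_output_size(128, kernel=5, pad=2, stride=1))  # 128
print(tf_same_output_size(128, stride=1))                 # 128
print(tf_same_output_size(128, stride=2))                 # 64
```

With kernel_size 5, pad 2 and stride 1, both frameworks keep the 128x128 spatial size, whereas any larger stride would shrink the feature map.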

Update:

Another contrived experiment (see the code below) shows that the TensorFlow implementation is able to compute the correct second value. However, the error still exists in the case above. What causes the error in the converted version?

import numpy as np
import tensorflow as tf

input = np.random.rand(100,100)
input = input.reshape([1,100,100,1])
k = np.random.rand(5,5)
k = k.reshape([5,5,1,1])
input_tf = tf.constant(input,dtype=tf.float32)
init = tf.constant_initializer(value=k, dtype=tf.float32)
filter = tf.get_variable(name="filter", initializer=init, shape=k.shape)
conv = tf.nn.conv2d(input_tf, filter, strides=[1,1,1,1], padding='SAME')


tensorflow conv2d unexpected convolution result
