Question

I've been coding along with this example of a convolutional network in TensorFlow, and I'm confused by this allocation of weights:

weights = {
    # 5x5 conv, 1 input, 32 outputs
    'wc1': tf.Variable(tf.random_normal([5, 5, 1, 32])),
    # 5x5 conv, 32 inputs, 64 outputs
    'wc2': tf.Variable(tf.random_normal([5, 5, 32, 64])),
    # fully connected, 7*7*64 inputs, 1024 outputs
    'wd1': tf.Variable(tf.random_normal([7*7*64, 1024])),
    # 1024 inputs, 10 outputs (class prediction)
    'out': tf.Variable(tf.random_normal([1024, n_classes]))
}

How do we know that the 'wd1' weight matrix should have 7 x 7 x 64 rows?

It is later used to reshape the output of the second convolution layer:

# Fully connected layer
# Reshape conv2 output to fit dense layer input
dense1 = tf.reshape(conv2, [-1, _weights['wd1'].get_shape().as_list()[0]]) 

# Relu activation
dense1 = tf.nn.relu(tf.add(tf.matmul(dense1, _weights['wd1']), _biases['bd1']))

By my math, pooling layer 2 (the conv2 output) has 4 x 4 x 64 neurons.

Why are we reshaping to [-1, 7*7*64]?

Answer 1

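Working from the start:

The input, _X, has size [28x28x1] (ignoring the batch dimension): a 28x28 greyscale image.

The first convolutional layer uses PADDING=same, so it outputs a 28x28 layer. That is then passed to a max_pool with k=2, which reduces each dimension by a factor of two, resulting in a 14x14 spatial layout. conv1 has 32 outputs, so the full per-example tensor is now [14x14x32].

This is repeated in conv2, which has 64 outputs, resulting in [7x7x64].

tl;dr: The image starts as 28x28, and each maxpool reduces it by a factor of two in each dimension. 28/2/2 = 7.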
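The walk-through above can be checked with a few lines of plain Python. This is only a sketch of the arithmetic, not the example's actual code: it assumes, as described above, that each convolution preserves the spatial size and each max pool halves it. The names shapes and fc_inputs are made up for illustration.

```python
# Per-example shape of the input: a 28x28 greyscale image.
height, width, depth = 28, 28, 1
shapes = [(height, width, depth)]

# conv1 has 32 outputs, conv2 has 64. Each conv keeps the spatial size
# (SAME padding, stride 1); each max pool with k=2 halves it.
for conv_outputs in (32, 64):
    depth = conv_outputs
    height, width = height // 2, width // 2
    shapes.append((height, width, depth))

# Number of values per example fed into the fully connected layer.
fc_inputs = height * width * depth

print(shapes)     # [(28, 28, 1), (14, 14, 32), (7, 7, 64)]
print(fc_inputs)  # 3136
```

The final value, 7*7*64 = 3136, is the row count of 'wd1' and the second dimension of the reshape.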

Answer 2

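This question requires a good understanding of deep learning convolutions.

Basically, each convolution layer in your model reduces the lateral area of the convolutional pyramid. This reduction comes from the convolution stride and the max_pooling stride. To complicate matters, there are two options depending on PADDING.

Option 1 - PADDING='SAME'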

out_height = ceil(float(in_height) / float(strides[1]))
out_width  = ceil(float(in_width) / float(strides[2]))

Option 2 - PADDING='VALID'

out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
out_width  = ceil(float(in_width - filter_width + 1) / float(strides[2]))
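The two formulas can be wrapped in a small helper to try them on concrete numbers. This is a minimal sketch in plain Python; the function name output_size and its argument names are hypothetical, and it handles one spatial dimension at a time.

```python
import math

def output_size(in_size, filter_size, stride, padding):
    """Output length along one spatial dimension, per the formulas above."""
    if padding == "SAME":
        return math.ceil(in_size / stride)
    if padding == "VALID":
        return math.ceil((in_size - filter_size + 1) / stride)
    raise ValueError(f"unknown padding: {padding!r}")

# 5x5 convolution with stride 1 on a 28-pixel side:
print(output_size(28, 5, 1, "SAME"))   # 28
print(output_size(28, 5, 1, "VALID"))  # 24

# 2x2 max pooling with stride 2 on the results:
print(output_size(28, 2, 2, "SAME"))   # 14
print(output_size(24, 2, 2, "VALID"))  # 12
```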

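For EACH convolution and max pooling call, you have to calculate the new out_height and out_width. Then, at the end of the convolutions, you multiply out_height, out_width and the depth of your last convolution layer. The result of this multiplication is the output feature map size, which is the input size of your first fully connected layer.

So, in your example, you probably had only PADDING='SAME', a convolution stride of 1 and a max pooling stride of 2, twice. In the end you just have to divide each spatial dimension by 4 (strides 1, 2, 1, 2).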
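To make the procedure concrete, here is a rough sketch that chains the formulas through two conv + pool stages and multiplies out the result. It assumes 5x5 filters with stride 1 and 2x2 pooling with stride 2; the helper names are hypothetical. The 'VALID' run is included only as a guess at where the 4 x 4 x 64 figure in the question might come from, which the question itself does not confirm.

```python
import math

def output_size(in_size, window, stride, padding):
    """Output length along one spatial dimension for a conv or pooling op."""
    if padding == "SAME":
        return math.ceil(in_size / stride)
    if padding == "VALID":
        return math.ceil((in_size - window + 1) / stride)
    raise ValueError(f"unknown padding: {padding!r}")

def flattened_size(in_size, padding, depth=64):
    """Trace conv (5x5, stride 1) then pool (2x2, stride 2), twice."""
    size = in_size
    for _ in range(2):
        size = output_size(size, 5, 1, padding)  # convolution
        size = output_size(size, 2, 2, padding)  # max pooling
    # Side length, and the number of inputs to the first dense layer.
    return size, size * size * depth

same_side, same_flat = flattened_size(28, "SAME")
valid_side, valid_flat = flattened_size(28, "VALID")

print(same_side, same_flat)    # 7 3136
print(valid_side, valid_flat)  # 4 1024
```

With 'SAME' padding the side length goes 28, 28, 14, 14, 7, which matches the 7*7*64 in the weights. With 'VALID' padding it goes 28, 24, 12, 8, 4.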

More info is available in the TensorFlow API documentation.

Fully connected layer weight dimensions in a TensorFlow ConvNet
