Question

I want to use conv2d in TensorFlow. My code is as follows:

import tensorflow as tf
filter_num = 3
filter_size = 5
char_embed_size = 300
max_word_length = 1
max_seq_length = 15
embeddings = tf.Variable(tf.random_normal([512,14,300]))
#cnn_inputs = tf.reshape(word_embeddings, (-1, max_word_length, char_embed_size, 1))
cnn_inputs = tf.reshape(embeddings, (-1, max_word_length, char_embed_size, 1))
filter_shape = [filter_size, char_embed_size, 1, filter_num]
w = tf.Variable(tf.truncated_normal(filter_shape, stddev=0.1), name="cnn_w")
b = tf.Variable(tf.constant(0.1, shape=[filter_num]), name="cnn_b")
conv = tf.nn.conv2d(
    cnn_inputs,
    w,
    strides=[1, 1, 1, 1],
    padding="VALID")

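When I run it, I get the following error:

ValueError: Negative dimension size caused by subtracting 5 from 1 for 'Conv2D' (op: 'Conv2D') with input shapes: [7168,1,300,1], [5,300,1,3].

It looks like the input shapes do not match. How can I fix this?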

Answer 1

TL;DR answer.

Use padding="SAME":

conv = tf.nn.conv2d(
    cnn_inputs,
    w,
    strides=[1, 1, 1, 1],
    padding="SAME") # old value is padding="VALID"

Detailed answer.

According to the TF documentation, the input tensor (cnn_inputs) should have shape [batch, in_height, in_width, in_channels], and the kernel tensor (w in your example) should have shape [filter_height, filter_width, in_channels, out_channels].

In your example:

cnn_inputs.shape is [7168, 1, 300, 1], so in_height == 1 and in_width == 300
w.shape is [5, 300, 1, 3], so filter_height == 5 and filter_width == 300
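As a side note on where the batch size 7168 comes from: the -1 in tf.reshape is filled in so that the total number of elements stays the same. The sketch below reproduces that arithmetic in plain Python; infer_reshape is a hypothetical helper written for this illustration, not a TensorFlow function, and it assumes at most one -1 in the target shape.

```python
from math import prod  # requires Python 3.8+

def infer_reshape(input_shape, target_shape):
    """Resolve a single -1 in target_shape so the element count is preserved.

    Hypothetical helper for illustration; assumes at most one -1.
    """
    total = prod(input_shape)
    known = prod(d for d in target_shape if d != -1)
    if total % known != 0:
        raise ValueError("shapes are incompatible")
    return [total // known if d == -1 else d for d in target_shape]

# embeddings has shape [512, 14, 300]; max_word_length = 1, char_embed_size = 300
cnn_inputs_shape = infer_reshape([512, 14, 300], [-1, 1, 300, 1])
print(cnn_inputs_shape)  # [7168, 1, 300, 1], i.e. 512 * 14 rows of height 1
```

Because max_word_length is 1, the dimension of size 14 is folded into the batch, which is why in_height ends up as 1.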

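With padding="VALID" and strides=[1, 1, 1, 1], the conv2d operation "shrinks" the input tensor along each spatial dimension by roughly the filter size: the output size is in_size - filter_size + 1. For example, if in_height == 20 and filter_height == 4, the output height is 20 - 4 + 1 = 17. In your example in_height == 1 and filter_height == 5, so the filter is taller than the input: TensorFlow subtracts 5 from 1 and gets -4, meaning the output tensor would have a negative height. That is impossible, hence the error.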
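The VALID arithmetic can be checked with a few lines of plain Python. This is a minimal sketch of the output-size formula only; valid_output_size is a hypothetical helper name, and the check for a filter larger than the input mimics the TensorFlow error rather than calling TensorFlow itself.

```python
def valid_output_size(in_size, filter_size, stride=1):
    """Output size along one spatial dimension for padding="VALID".

    Hypothetical helper for illustration, not part of TensorFlow.
    """
    if filter_size > in_size:
        # Same situation as the error in the question: the filter does not fit.
        raise ValueError(
            "negative dimension: subtracting %d from %d" % (filter_size, in_size)
        )
    return (in_size - filter_size) // stride + 1

print(valid_output_size(20, 4))     # 17
print(valid_output_size(300, 300))  # 1, the width in the question fits exactly

try:
    valid_output_size(1, 5)         # the height in the question does not fit
except ValueError as err:
    print("error:", err)
```

The width works because the 300-wide filter fits the 300-wide input exactly once; only the height (1 versus 5) is the problem.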

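With padding="SAME", conv2d tries to preserve the spatial dimensions by padding the input with zeros (this is called "zero padding"). With a stride of 1, the output tensor's height is therefore the same as in_height.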
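To see how much zero padding that implies, here is a small sketch of the SAME rule as described in the TensorFlow padding notes: the output size is ceil(in_size / stride), and any odd padding amount puts the extra zero at the end. same_padding is a hypothetical helper written for this illustration.

```python
import math

def same_padding(in_size, filter_size, stride=1):
    """Return (out_size, pad_before, pad_after) for padding="SAME".

    Hypothetical helper for illustration, not part of TensorFlow.
    """
    out_size = math.ceil(in_size / stride)
    total_pad = max((out_size - 1) * stride + filter_size - in_size, 0)
    pad_before = total_pad // 2
    pad_after = total_pad - pad_before  # the extra zero, if any, goes at the end
    return out_size, pad_before, pad_after

height = same_padding(1, 5)      # in_height == 1, filter_height == 5
width = same_padding(300, 300)   # in_width == 300, filter_width == 300
print(height)  # (1, 2, 2)
print(width)   # (300, 149, 150)
```

Under these formulas the output in the question's example would have shape [7168, 1, 300, 3]. Note that the width stays at 300 instead of collapsing to 1 as it would with "VALID", so it is worth checking that this is the shape you actually want.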

You can find a more detailed explanation of padding here: What is the difference between 'SAME' and 'VALID' padding in tf.nn.max_pool of tensorflow?
