Question

I am working on an image segmentation task in TensorFlow.

Code:

import tensorflow as tf

height = 1024
width = 1024
channels = 1

# input placeholders
X = tf.placeholder(tf.float32, [None, height, width, channels], name = 'image')

Y = tf.placeholder(tf.float32, [None, height, width, channels], name = 'annotation')

# variable learning rate
lr = tf.placeholder(tf.float32, name = 'lr')




W1 = tf.Variable(tf.truncated_normal([3, 3, 1, 2], stddev=0.1)) 
B1 = tf.Variable(tf.ones([2])/(2*2))

W2 = tf.Variable(tf.truncated_normal([3, 3, 2, 1], stddev=0.1)) 
B2 = tf.Variable(tf.ones([1])/(1*1))

W3 = tf.Variable(tf.truncated_normal([2, 2, 1, 1], stddev=0.1)) 
B3 = tf.Variable(tf.ones([1])/(1*1))




stride = 1  
Y1 = tf.nn.relu(tf.nn.conv2d(X, W1, strides=[1, stride, stride, 1], padding= 'VALID') + B1)

stride = 1  
Y2 = tf.nn.relu(tf.nn.conv2d(Y1, W2, strides=[1, stride, stride, 1], padding='VALID') + B2)

Ylogits = tf.nn.conv2d_transpose(Y2, W3, output_shape = [1, 1024, 1024, 1], strides = [1, 2, 2, 1])

Everything works fine until I run this line:

train_step = tf.train.GradientDescentOptimizer(learning_rate = lr).minimize(cross_entropy_sum)

Traceback:

train_step = tf.train.GradientDescentOptimizer(learning_rate = lr).minimize(cross_entropy_sum)

  File "/home/anaconda/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 315, in minimize
    grad_loss=grad_loss)
  File "/home/anaconda/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 386, in compute_gradients
    colocate_gradients_with_ops=colocate_gradients_with_ops)
  File "/home/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/gradients_impl.py", line 580, in gradients
    in_grad.set_shape(t_in.get_shape())
  File "/home/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 413, in set_shape
    self._shape = self._shape.merge_with(shape)
  File "/home/anaconda/lib/python2.7/site-packages/tensorflow/python/framework/tensor_shape.py", line 564, in merge_with
    (self, other))
ValueError: Shapes (1, 512, 512, 1) and (?, 1020, 1020, 1) are not compatible

Answer 1

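Because the conv2d_transpose operation has strides of [1, 2, 2, 1], the height and width of its output are twice those of its input. For a visualization of why this happens, see this Data Science answer and the page it links to.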

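You set output_shape = [1, 1024, 1024, 1], and the deconvolution layer needs an input with half that height and width and the same batch size and number of channels. That means it expects an input of shape [1, 512, 512, 1].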

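The input you give it, Y2, has shape [None, 1020, 1020, 1]. Each of the two conv2d layers uses a 3x3 kernel with 'VALID' padding, so each one trims two pixels from the height and width (1024 to 1022 to 1020), but because the stride is 1 they do not reduce the size significantly. The batch size is None because it passes through these layers unchanged.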
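The two shapes in the error message can be reproduced with plain arithmetic. The sketch below applies the usual output-size rules for 'VALID' and 'SAME' padding; the helper name conv_output_size is hypothetical and is not part of TensorFlow.

```python
import math


def conv_output_size(size, kernel, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    if padding == 'VALID':
        # No padding: the kernel must fit entirely inside the input.
        return int(math.ceil((size - kernel + 1) / float(stride)))
    if padding == 'SAME':
        # Padded so that only the stride shrinks the output.
        return int(math.ceil(size / float(stride)))
    raise ValueError('unknown padding: %r' % padding)


# Two 3x3 'VALID' convolutions with stride 1, as in the question.
y1_size = conv_output_size(1024, kernel=3, stride=1, padding='VALID')
y2_size = conv_output_size(y1_size, kernel=3, stride=1, padding='VALID')

# conv2d_transpose with stride 2 and its default 'SAME' padding behaves like
# the gradient of a stride-2 'SAME' convolution whose input has the requested
# output_shape, so it expects an input of this size.
expected_input = conv_output_size(1024, kernel=2, stride=2, padding='SAME')

print('Y2 size: %d, size conv2d_transpose expects: %d' % (y2_size, expected_input))
```

The two numbers, 1020 and 512, are the ones that appear in the ValueError.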

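To fix this, one option is to use strides of [1, 2, 2, 1] on one of your convolutional layers, which roughly halves the width and height. To halve them exactly, use padding='SAME' on the convolutional layers, which prevents the edges from being trimmed. You will also have to change the output_shape of the deconvolution layer to [None, 1024, 1024, 1] so that its batch size matches the input.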
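One way the fix could look, as a minimal sketch that assumes the TensorFlow 1.x graph API used in the question. The function name build_fixed_graph is hypothetical. A literal None generally cannot be converted to a tensor inside output_shape, so this sketch reads the batch size at run time with tf.shape(X)[0] instead.

```python
def build_fixed_graph(height=1024, width=1024, channels=1):
    """Build the question's network with the shape fix applied."""
    import tensorflow as tf  # assumption: TensorFlow 1.x (tf.placeholder etc.)

    X = tf.placeholder(tf.float32, [None, height, width, channels], name='image')

    W1 = tf.Variable(tf.truncated_normal([3, 3, 1, 2], stddev=0.1))
    B1 = tf.Variable(tf.ones([2]) / 4.0)
    W2 = tf.Variable(tf.truncated_normal([3, 3, 2, 1], stddev=0.1))
    B2 = tf.Variable(tf.ones([1]))
    W3 = tf.Variable(tf.truncated_normal([2, 2, 1, 1], stddev=0.1))

    # Stride 2 with 'SAME' padding halves height and width exactly: 1024 -> 512.
    Y1 = tf.nn.relu(
        tf.nn.conv2d(X, W1, strides=[1, 2, 2, 1], padding='SAME') + B1)
    # Stride 1 with 'SAME' padding keeps the size at 512.
    Y2 = tf.nn.relu(
        tf.nn.conv2d(Y1, W2, strides=[1, 1, 1, 1], padding='SAME') + B2)

    # Take the batch size from the input at run time so any batch can be fed.
    batch_size = tf.shape(X)[0]
    output_shape = tf.stack([batch_size, height, width, channels])
    Ylogits = tf.nn.conv2d_transpose(
        Y2, W3, output_shape=output_shape, strides=[1, 2, 2, 1], padding='SAME')
    return X, Ylogits
```

With these settings Y2 has shape [None, 512, 512, 1], which is what a stride-2 transposed convolution needs to produce a 1024 x 1024 output.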

Incompatible shapes when running the optimizer step in TensorFlow image segmentation
