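I have a question or two about the input dimensions of a modified U-Net architecture. To save you time and make my result easier to understand and reproduce, I am posting the code and the output dimensions. The modified U-Net is the MultiResUNet architecture from https://github.com/nibtehaz/MultiResUNet/blob/master/MultiResUNet.py, which is based on the paper https://arxiv.org/abs/1902.04049. Please don't be put off by the length of the code. You can simply copy and paste it, and reproducing the issue should take no more than 10 seconds. You also don't need a dataset. Tested with TF.v1.9 and Keras v.2.20.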
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Conv2DTranspose, concatenate, BatchNormalization, Activation, add
from tensorflow.keras.models import Model
from tensorflow.keras.activations import relu
# 2D convolutional layer followed by batch normalization
# Arguments:
#     x {keras layer} -- input layer
#     filters {int} -- number of filters
#     num_row {int} -- number of rows in filters
#     num_col {int} -- number of columns in filters
# Keyword Arguments:
#     padding {str} -- mode of padding (default: {'same'})
#     strides {tuple} -- stride of convolution operation (default: {(1, 1)})
#     activation {str} -- activation function (default: {'relu'})
#     name {str} -- name of the layer (default: {None})
# Returns:
#     [keras layer] -- [output layer]
def conv2d_bn(x, filters, num_row, num_col, padding='same', strides=(1, 1), activation='relu', name=None):
    x = Conv2D(filters, (num_row, num_col), strides=strides, padding=padding, use_bias=False)(x)
    x = BatchNormalization(axis=3, scale=False)(x)
    if activation is None:
        return x
    x = Activation(activation, name=name)(x)
    return x
# 2D transposed convolutional layer followed by batch normalization
# Arguments:
#     x {keras layer} -- input layer
#     filters {int} -- number of filters
#     num_row {int} -- number of rows in filters
#     num_col {int} -- number of columns in filters
# Keyword Arguments:
#     padding {str} -- mode of padding (default: {'same'})
#     strides {tuple} -- stride of convolution operation (default: {(2, 2)})
#     name {str} -- name of the layer (default: {None})
# Returns:
#     [keras layer] -- [output layer]
def trans_conv2d_bn(x, filters, num_row, num_col, padding='same', strides=(2, 2), name=None):
    x = Conv2DTranspose(filters, (num_row, num_col), strides=strides, padding=padding)(x)
    x = BatchNormalization(axis=3, scale=False)(x)
    return x
# MultiRes block
# Arguments:
#     U {int} -- number of filters in the corresponding UNet stage
#     inp {keras layer} -- input layer
# Returns:
#     [keras layer] -- [output layer]
def MultiResBlock(U, inp, alpha=1.67):
    W = alpha * U
    # 1x1 shortcut whose channel count matches the concatenated branches
    shortcut = inp
    shortcut = conv2d_bn(shortcut, int(W*0.167) + int(W*0.333) +
                         int(W*0.5), 1, 1, activation=None, padding='same')
    # Three chained 3x3 convolutions; the second and third stand in for 5x5 and 7x7
    conv3x3 = conv2d_bn(inp, int(W*0.167), 3, 3,
                        activation='relu', padding='same')
    conv5x5 = conv2d_bn(conv3x3, int(W*0.333), 3, 3,
                        activation='relu', padding='same')
    conv7x7 = conv2d_bn(conv5x5, int(W*0.5), 3, 3,
                        activation='relu', padding='same')
    out = concatenate([conv3x3, conv5x5, conv7x7], axis=3)
    out = BatchNormalization(axis=3)(out)
    out = add([shortcut, out])
    out = Activation('relu')(out)
    out = BatchNormalization(axis=3)(out)
    return out
# ResPath
# Arguments:
#     filters {int} -- number of filters
#     length {int} -- length of ResPath
#     inp {keras layer} -- input layer
# Returns:
#     [keras layer] -- [output layer]
def ResPath(filters, length, inp):
    shortcut = inp
    shortcut = conv2d_bn(shortcut, filters, 1, 1,
                         activation=None, padding='same')
    out = conv2d_bn(inp, filters, 3, 3, activation='relu', padding='same')
    out = add([shortcut, out])
    out = Activation('relu')(out)
    out = BatchNormalization(axis=3)(out)
    for i in range(length-1):
        shortcut = out
        shortcut = conv2d_bn(shortcut, filters, 1, 1,
                             activation=None, padding='same')
        out = conv2d_bn(out, filters, 3, 3, activation='relu', padding='same')
        out = add([shortcut, out])
        out = Activation('relu')(out)
        out = BatchNormalization(axis=3)(out)
    return out
# MultiResUNet
# Arguments:
#     height {int} -- height of image
#     width {int} -- width of image
#     n_channels {int} -- number of channels in image
# Returns:
#     [keras model] -- MultiResUNet model
def MultiResUnet(height, width, n_channels):
    inputs = Input((height, width, n_channels))
    # Downsampling part begins here
    mresblock1 = MultiResBlock(32, inputs)
    pool1 = MaxPooling2D(pool_size=(2, 2))(mresblock1)
    mresblock1 = ResPath(32, 4, mresblock1)
    mresblock2 = MultiResBlock(32*2, pool1)
    pool2 = MaxPooling2D(pool_size=(2, 2))(mresblock2)
    mresblock2 = ResPath(32*2, 3, mresblock2)
    mresblock3 = MultiResBlock(32*4, pool2)
    pool3 = MaxPooling2D(pool_size=(2, 2))(mresblock3)
    mresblock3 = ResPath(32*4, 2, mresblock3)
    mresblock4 = MultiResBlock(32*8, pool3)
    # Upsampling part
    up5 = concatenate([Conv2DTranspose(
        32*4, (2, 2), strides=(2, 2), padding='same')(mresblock4), mresblock3], axis=3)
    mresblock5 = MultiResBlock(32*8, up5)
    up6 = concatenate([Conv2DTranspose(
        32*4, (2, 2), strides=(2, 2), padding='same')(mresblock5), mresblock2], axis=3)
    mresblock6 = MultiResBlock(32*4, up6)
    up7 = concatenate([Conv2DTranspose(
        32*2, (2, 2), strides=(2, 2), padding='same')(mresblock6), mresblock1], axis=3)
    mresblock7 = MultiResBlock(32*2, up7)
    conv8 = conv2d_bn(mresblock7, 1, 1, 1, activation='sigmoid')
    model = Model(inputs=[inputs], outputs=[conv8])
    return model
Now back to my problem: the input/output dimensions in the UNet architecture do not match.
If I choose an input height/width of (128, 128), (256, 256), or (512, 512) and run:
model = MultiResUnet(128, 128,3)
display(model.summary())
TensorFlow gives me a clean summary of the whole architecture. Now, if I do this:
model = MultiResUnet(36, 36,3)
display(model.summary())
I get this error:
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
----> 1 model = MultiResUnet(36, 36, 3)
      2 display(model.summary())

in MultiResUnet(height, width, n_channels)
     25
     26     up5 = concatenate([Conv2DTranspose(
---> 27         32*4, (2, 2), strides=(2, 2), padding='same')(mresblock4), mresblock3], axis=3)
     28     mresblock5 = MultiResBlock(32*8, up5)
     29

~/miniconda3/envs/MastersThenv/lib/python3.6/site-packages/tensorflow/python/keras/layers/merge.py in concatenate(inputs, axis, **kwargs)
    682       A tensor, the concatenation of the inputs alongside axis `axis`.
    683   """
--> 684   return Concatenate(axis=axis, **kwargs)(inputs)
    685
    686

~/miniconda3/envs/MastersThenv/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    694 ...
    695 ...
--> 696           self.build(input_shapes)
    697
    698         # Check input assumptions set after layer building, e.g. input shape.

~/miniconda3/envs/MastersThenv/lib/python3.6/site-packages/tensorflow/python/keras/utils/tf_utils.py in wrapper(instance, input_shape)
    146     else:
    147 ...
--> 148     output_shape = fn(instance, input_shape)
    149     if output_shape is not None:
    150       if isinstance(output_shape, list):

~/miniconda3/envs/MastersThenv/lib/python3.6/site-packages/tensorflow/python/keras/layers/merge.py in build(self, input_shape)
    388                        'inputs with matching shapes '
    389                        'except for the concat axis. '
--> 390                        'Got inputs shapes: %s' % (input_shape))
    391
    392   def _merge_function(self, inputs):

ValueError: A `Concatenate` layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 8, 8, 128), (None, 9, 9, 128)]
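Why does Conv2DTranspose give me the wrong dimension, (None, 8, 8, 128) instead of (None, 9, 9, 128), and why doesn't the concatenate function complain when I choose input sizes such as (128, 128) or (256, 256) (multiples of 32)?
To generalize the question: how can I make this UNet architecture work with any input size, and how should I handle a Conv2DTranspose layer whose output is smaller in width/height than the dimension actually required (when the input size is not a multiple of 32, or is not square)? Why doesn't this happen with input sizes that are multiples of 32? And what if I have variable input sizes?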
Any help would be greatly appreciated.
Cheers, 高
Answer 0 (score: 3)
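U-Net-family models (such as the MultiResUNet model above) follow an encoder-decoder architecture. The encoder is the downsampling path that extracts features, while the decoder is the upsampling path. Feature maps from the encoder are concatenated in the decoder through skip connections. They are concatenated along the last axis, the 'channel' axis (taking the feature shape to be [batch_size, height, width, channels]). To concatenate features along any axis (here, the 'channel' axis), the dimensions along all other axes must match.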
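In the model architecture above, 3 downsampling/max-pooling operations are performed in the encoder path (through MaxPooling2D). In the decoder path, 3 upsampling/transposed-convolution operations are performed, aiming to restore the image to its full size. However, for the concatenation (through the skip connections) to work, the height, width, and batch_size of the downsampled and upsampled features must be identical at every 'level' of the model. I will illustrate this with the examples you mentioned in the question: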
Case 1: input size (128, 128, 3): 128 -> 64 -> 32 -> 16 -> 32 -> 64 -> 128
Case 2: input size (36, 36, 3): 36 -> 18 -> 9 -> 4 -> 8 -> 16 -> 32
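In the second case, once the height and width of the feature map reach 9 in the encoder path, further downsampling causes a change (loss) in dimension that cannot be recovered in the decoder when upsampling: pooling floors 9 down to 4, and upsampling 4 by a factor of 2 gives 8, not 9. Hence the error is raised, because feature maps of shapes (None, 8, 8, 128) and (None, 9, 9, 128) cannot be concatenated.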
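The two size chains above can be reproduced with a few lines of plain Python. This is only a sketch of the shape arithmetic, not part of the model: it assumes 2x2 max pooling with the default 'valid' padding (which floors odd sizes) and stride-2 transposed convolutions with 'same' padding (which double the size). The helper names `trace_sizes` and `first_mismatch` are made up for illustration.

```python
def trace_sizes(size, depth=3):
    """Return (encoder_sizes, decoder_sizes) for one spatial dimension.

    Assumes 2x2 max pooling with 'valid' padding (size // 2) and
    stride-2 transposed convolutions with 'same' padding (size * 2).
    """
    encoder = [size]
    for _ in range(depth):
        encoder.append(encoder[-1] // 2)  # pooling floors odd sizes
    decoder = []
    current = encoder[-1]
    for _ in range(depth):
        current *= 2  # each transposed convolution doubles the size
        decoder.append(current)
    return encoder, decoder


def first_mismatch(size, depth=3):
    """Return (upsampled, skip) for the first failing skip connection, else None."""
    encoder, decoder = trace_sizes(size, depth)
    # The i-th upsampled map is concatenated with the encoder map one level up,
    # so walk the encoder sizes backwards, skipping the bottleneck.
    skips = encoder[-2::-1]
    for up, skip in zip(decoder, skips):
        if up != skip:
            return up, skip
    return None


print(trace_sizes(128))     # ([128, 64, 32, 16], [32, 64, 128])
print(first_mismatch(128))  # None
print(trace_sizes(36))      # ([36, 18, 9, 4], [8, 16, 32])
print(first_mismatch(36))   # (8, 9)
```

For an input of 36 the first failing pair is (8, 9), which matches the two shapes reported in the ValueError.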
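In general, for a simple encoder-decoder model (with skip connections) that has 'n' downsampling (MaxPooling2D) layers, the input dimensions must be a multiple of 2^n for the encoder features to be concatenated in the decoder. Here n = 3, so the input must be a multiple of 8 to avoid these dimension-mismatch errors.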
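As a rough illustration of this rule, the check below tests whether a given height or width is a multiple of 2^n and reports the nearest valid sizes on either side. The function names and the `n_pools` parameter are hypothetical, chosen only for this sketch.

```python
def is_valid_input_size(size, n_pools=3):
    """True if `size` survives n_pools halvings without any rounding."""
    factor = 2 ** n_pools
    return size >= factor and size % factor == 0


def nearest_valid_sizes(size, n_pools=3):
    """Return the closest multiples of 2**n_pools below and above `size`."""
    factor = 2 ** n_pools
    lower = (size // factor) * factor
    upper = -(-size // factor) * factor  # ceiling division using integers only
    return lower, upper


for s in (128, 256, 36):
    print(s, is_valid_input_size(s), nearest_valid_sizes(s))
```

With n = 3, sizes 128 and 256 pass the check, while 36 does not and falls between the valid sizes 32 and 40.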
Hope this helps! :)
Answer 1 (score: 0)
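Thanks for the great answer, @Balraj Ashwath! Following on from it: if your input has size h along one dimension and you want to use this architecture with depth d (h >= 2^d), one possibility is to zero-pad that dimension by delta_h, given by the following expression: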
import numpy as np
h, d = 36, 3
delta_h = np.ceil(h/(2**d)) * (2**d) - h
print(delta_h)
> 4.0
Then, following @Balraj Ashwath's example:
40 -> 20 -> 10 -> 5 -> 10 -> 20 -> 40
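One way to apply this padding, sketched here under assumptions, is to split delta_h between the two sides of each spatial dimension. The helper `pad_amounts` is hypothetical; the tuple it builds has the ((top, bottom), (left, right)) layout accepted by the `padding` argument of Keras's ZeroPadding2D, and the same tuple could be passed as `cropping` to Cropping2D to trim the output back to the original size.

```python
def pad_amounts(size, depth=3):
    """Split the zero padding needed to reach the next multiple of 2**depth.

    Returns (before, after); any odd remainder goes to the 'after' side.
    """
    factor = 2 ** depth
    total = (-size) % factor  # same value as ceil(size / factor) * factor - size
    before = total // 2
    after = total - before
    return before, after


height, width = 36, 36
padding = (pad_amounts(height), pad_amounts(width))
print(padding)  # ((2, 2), (2, 2))

# Hypothetical usage inside the model (not executed here):
#   x = ZeroPadding2D(padding=padding)(inputs)   # 36x36 -> 40x40
#   ...
#   outputs = Cropping2D(cropping=padding)(x)    # 40x40 -> 36x36
```

Integer arithmetic is used instead of np.ceil so that the result is an int, which is what the padding layers expect.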