I have a Keras implementation of a CNN (specifically a U-Net) that I train by feeding it 256x256x3 (RGB) retina images along with accompanying image masks of the same size as input:
The masks are my ground truth. Each pixel in a mask is one of 10 unique colors (white, black, blue, etc.), and each color maps to the location of one of 10 biological layers in the original retina image.
The U-Net currently outputs a 256x256x3 image in which each pixel's color value should match the corresponding color in the image mask. However, I would like the output to be a 256x256x10 array in which each pixel holds the probability (0.0 to 1.0) of each of the 10 colors occupying that position.
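(To train against such a 256x256x10 target, each color mask also has to be converted to a one-hot class map. Below is a minimal NumPy sketch, where COLOR_TO_CLASS is a hypothetical mapping from the 10 mask colors to class indices; the real mapping depends on the dataset.)

import numpy as np

# Hypothetical mapping from mask colors (RGB tuples) to class indices 0..9;
# the actual 10 colors come from the dataset.
COLOR_TO_CLASS = {
    (255, 255, 255): 0,  # white
    (0, 0, 0): 1,        # black
    (0, 0, 255): 2,      # blue
    # ... the remaining 7 colors go here
}

def mask_to_onehot(mask_rgb, n_classes=10):
    """Convert a 256x256x3 color mask to a 256x256xn_classes one-hot array."""
    h, w, _ = mask_rgb.shape
    onehot = np.zeros((h, w, n_classes), dtype=np.float32)
    for color, cls in COLOR_TO_CLASS.items():
        # Mark the class channel wherever all three RGB values match this color
        onehot[np.all(mask_rgb == color, axis=-1), cls] = 1.0
    return onehot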
Here is the code for my U-Net:
from keras.models import Model
from keras.layers import (Input, Conv2D, Conv2DTranspose, MaxPooling2D,
                          Dropout, BatchNormalization, Activation, concatenate)

# --------------------------------------------------------------------------------------
# CONV 2D BLOCK
# --------------------------------------------------------------------------------------
def conv2d_block(input_tensor, n_filters, kernel_size=3, batchnorm=True):
    """Function to add 2 convolutional layers with the parameters passed to it"""
    # first layer
    x = Conv2D(filters=n_filters, kernel_size=kernel_size, data_format="channels_last",
               kernel_initializer='he_normal', padding='same')(input_tensor)
    if batchnorm:
        x = BatchNormalization()(x)
    x = Activation('relu')(x)
    # second layer (must be applied to x, not input_tensor, or the first layer is bypassed)
    x = Conv2D(filters=n_filters, kernel_size=kernel_size, data_format="channels_last",
               kernel_initializer='he_normal', padding='same')(x)
    if batchnorm:
        x = BatchNormalization()(x)
    x = Activation('relu')(x)
    return x
# --------------------------------------------------------------------------------------
# GET THE U-NET ARCHITECTURE
# --------------------------------------------------------------------------------------
def get_unet(input_img, n_filters=16, dropout=0.1, batchnorm=True):
    # Contracting Path (256 x 256 x 3)
    c1 = conv2d_block(input_img, n_filters * 1, kernel_size=(3, 3), batchnorm=batchnorm)
    p1 = MaxPooling2D((2, 2))(c1)
    p1 = Dropout(dropout)(p1)
    c2 = conv2d_block(p1, n_filters * 2, kernel_size=(3, 3), batchnorm=batchnorm)
    p2 = MaxPooling2D((2, 2))(c2)
    p2 = Dropout(dropout)(p2)
    c3 = conv2d_block(p2, n_filters * 4, kernel_size=(3, 3), batchnorm=batchnorm)
    p3 = MaxPooling2D((2, 2))(c3)
    p3 = Dropout(dropout)(p3)
    c4 = conv2d_block(p3, n_filters * 8, kernel_size=(3, 3), batchnorm=batchnorm)
    p4 = MaxPooling2D((2, 2))(c4)
    p4 = Dropout(dropout)(p4)
    c5 = conv2d_block(p4, n_filters * 16, kernel_size=(3, 3), batchnorm=batchnorm)
    # Expansive Path
    u6 = Conv2DTranspose(n_filters * 8, 3, strides=(2, 2), padding='same')(c5)
    u6 = concatenate([u6, c4])
    u6 = Dropout(dropout)(u6)
    c6 = conv2d_block(u6, n_filters * 8, kernel_size=3, batchnorm=batchnorm)
    u7 = Conv2DTranspose(n_filters * 4, 3, strides=(2, 2), padding='same')(c6)
    u7 = concatenate([u7, c3])
    u7 = Dropout(dropout)(u7)
    c7 = conv2d_block(u7, n_filters * 4, kernel_size=3, batchnorm=batchnorm)
    u8 = Conv2DTranspose(n_filters * 2, 3, strides=(2, 2), padding='same')(c7)
    u8 = concatenate([u8, c2])
    u8 = Dropout(dropout)(u8)
    c8 = conv2d_block(u8, n_filters * 2, kernel_size=3, batchnorm=batchnorm)
    u9 = Conv2DTranspose(n_filters * 1, 3, strides=(2, 2), padding='same')(c8)
    u9 = concatenate([u9, c1])
    u9 = Dropout(dropout)(u9)
    c9 = conv2d_block(u9, n_filters * 1, kernel_size=3, batchnorm=batchnorm)
    outputs = Conv2D(3, 1, activation='sigmoid')(c9)
    model = Model(inputs=[input_img], outputs=[outputs])
    return model
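For reference, a minimal sketch of how the model might be instantiated, assuming the imports above:

input_img = Input((256, 256, 3))  # 256x256 RGB retina image
model = get_unet(input_img, n_filters=16, dropout=0.1, batchnorm=True)
model.summary()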
My question is: how do I change the design of the network so that it takes the same input but produces a 256x256x10 prediction for each pixel of the corresponding input image and mask?
Answer 0 (score: 1)

Change the final layer to:

outputs = Conv2D(10, 1, activation='softmax')(c9)

outputs is then a tensor of shape [?, 256, 256, 10], with the softmax activation applied along the last axis (axis=-1).
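Putting that change in context, here is a minimal sketch of the modified output head together with a matching training setup; the optimizer and loss below are standard Keras choices assumed for illustration, not part of the original answer:

# 10-channel softmax head: each pixel gets a probability distribution
# over the 10 layer classes (the values sum to 1.0 along the last axis)
outputs = Conv2D(10, 1, activation='softmax')(c9)
model = Model(inputs=[input_img], outputs=[outputs])

# The ground-truth masks must be one-hot encoded to 256x256x10 to match,
# and a per-pixel categorical loss used:
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])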