How do I align dimensions in a Keras CNN so the output matches a custom loss function?

Date: 2019-04-09 15:11:37

Tags: python tensorflow keras deep-learning faster-rcnn

I cannot get this model to compile.

I am trying to implement VGG16, but with a custom loss function. The target variable has shape (?, 14, 14, 9, 6), where we apply binary cross-entropy only using Y_train[:,:,:,:,0], and use Y_train[:,:,:,:,1] as a switch that turns the loss off per element, effectively making this a mini-batch; the remaining variables will be used on a separate branch of the neural net. This branch is a binary classification problem, so I only want an output of shape (?, 14, 14, 9, 1).
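The switch/label layout described above can be sketched with a NumPy toy (shrunken shapes and random data, not the actual Y_train — channel 0 acts as the on/off switch, channel 1 as the binary label):

```python
import numpy as np

np.random.seed(0)
eps = 1e-4

# (batch, 14, 14, 9, 6) shrunk to (1, 2, 2, 3, 6) for illustration
y_true = np.zeros((1, 2, 2, 3, 6))
y_true[..., 0] = np.random.randint(0, 2, (1, 2, 2, 3))  # switch: anchor contributes or not
y_true[..., 1] = np.random.randint(0, 2, (1, 2, 2, 3))  # binary class label
y_pred = np.random.uniform(0.01, 0.99, (1, 2, 2, 3))    # sigmoid outputs

# Per-element binary cross-entropy, zeroed wherever the switch is 0,
# normalised by the number of active elements (epsilon avoids 0/0).
bce = -(y_true[..., 1] * np.log(y_pred) + (1 - y_true[..., 1]) * np.log(1 - y_pred))
loss = (y_true[..., 0] * bce).sum() / (eps + y_true[..., 0].sum())
```

This is only a NumPy stand-in for the backend ops used later; the point is that only elements with switch 1 contribute to the sum.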

I have listed my error below. Could you explain, first, what is going wrong and, second, how I can fix it?

Model code:

img_input = Input(shape = (224,224,3))

x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

# # Block 2
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

# Block 3
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

# # Block 4
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

# # Block 5
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)

x = Conv2D(512, (3, 3), padding='same', activation='relu', kernel_initializer='normal', name='rpn_conv1')(x)

x_class = Conv2D(9, (1, 1), activation='sigmoid', kernel_initializer='uniform', name='rpn_out_class')(x)

x_class = Reshape((14,14,9,1))(x_class)
model = Model(inputs=img_input, outputs=x_class)
model.compile(loss=rpn_loss_cls(), optimizer='adam')
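As a quick sanity check on the Reshape target (plain arithmetic, assuming the layer settings above): the four 2×2 max-pools halve the 224×224 input four times, which is where the 14×14 grid comes from, and the nine filters of rpn_out_class supply the anchor axis:

```python
# Each of the four pooling layers (block1_pool .. block4_pool) halves the
# spatial dimensions; block 5 has no pooling in this model.
size = 224
for _ in range(4):
    size //= 2
# rpn_out_class then emits (batch, size, size, 9), which
# Reshape((14, 14, 9, 1)) rearranges to (batch, 14, 14, 9, 1).
print(size)  # 14
```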

Loss function code:

def rpn_loss_cls(lambda_rpn_class=1.0, epsilon=1e-4):

    def rpn_loss_cls_fixed_num(y_true, y_pred):
        return lambda_rpn_class * K.sum(
            y_true[:, :, :, :, 0]
            * K.binary_crossentropy(y_pred[:, :, :, :, :], y_true[:, :, :, :, 1])
        ) / K.sum(epsilon + y_true[:, :, :, :, 0])

    return rpn_loss_cls_fixed_num

Error:

ValueError: logits and labels must have the same shape ((?, ?, ?, ?) vs (?, 14, 14, 9, 1))

Note: I have read multiple questions on this site with the same error, but none of those solutions allowed my model to compile.

Possible solution:

I kept experimenting and found that by adding

y_true = K.expand_dims(y_true, axis=-1)

I was able to compile the model. I am still skeptical as to whether this actually works correctly.

1 Answer:

Answer 0 (score: 0):

Keras sets the shape of y_true to match the model's output shape, which is why the loss function raises a shape-mismatch error. You therefore need to align the dimensions with expand_dims, but this has to be done with your model architecture, your data, and your loss function in mind. The code below will compile.

def rpn_loss_cls(lambda_rpn_class=1.0, epsilon=1e-4):

    def rpn_loss_cls_fixed_num(y_true, y_pred):
        y_true = tf.keras.backend.expand_dims(y_true, -1)
        return lambda_rpn_class * K.sum(
            y_true[:, :, :, :, 0]
            * K.binary_crossentropy(y_pred[:, :, :, :, :], y_true[:, :, :, :, 1])
        ) / K.sum(epsilon + y_true[:, :, :, :, 0])

    return rpn_loss_cls_fixed_num
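As a sanity check on the expand_dims step, here is a NumPy stand-in (toy tensors, not the real training data) showing how restoring a squeezed trailing axis makes the channel slicing in the loss valid again:

```python
import numpy as np

# Suppose y_true arrives with its last axis squeezed away, rank 4 instead of 5:
y_true = np.random.randint(0, 2, (1, 14, 14, 9)).astype(float)

# expand_dims(-1) appends a trailing axis of size 1 ...
y_true = np.expand_dims(y_true, -1)   # (1, 14, 14, 9) -> (1, 14, 14, 9, 1)

# ... so five-index slicing such as y_true[:, :, :, :, 0] works again.
switch = y_true[:, :, :, :, 0]        # (1, 14, 14, 9)
```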