Keras: How to handle imbalanced classes in a semantic segmentation task?

Asked: 2018-03-27 17:21:08

Tags: deep-learning keras

I have an FCN-32-like model built on a pretrained VGG16, defined as follows:

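from keras.applications.vgg16 import VGG16
from keras.layers import Conv2D, Conv2DTranspose, Reshape
from keras.models import Model
from keras.optimizers import Adadelta

NUMBER_OF_CLASSES = 3  # three classes, matching the summary and class ratios below

def pop_layer(model):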
    if not model.outputs:
        raise Exception('Sequential model cannot be popped: model is empty.')

    model.layers.pop()
    if not model.layers:
        model.outputs = []
        model.inbound_nodes = []
        model.outbound_nodes = []
    else:
        model.layers[-1].outbound_nodes = []
        model.outputs = [model.layers[-1].output]
    model.built = False

def get_model():
    #Fully convolutional part of VGG16
    model = VGG16(include_top=False, weights='imagenet')

    #Remove last max pooling layer
    pop_layer(model)

    #Freeze pretrained layers
    for layer in model.layers:
        layer.trainable = False

    model = Model(inputs=model.inputs, outputs=model.outputs)

    #print('len(model.layers)', len(model.layers)) #
    #print(model.summary()) #

    x = Conv2D(512, (3, 3), activation='relu', padding='same')(model.output)
    x = Conv2DTranspose(NUMBER_OF_CLASSES, kernel_size=(32, 32), strides=(16, 16), activation='sigmoid', padding='same')(x)
    head = Reshape((-1,NUMBER_OF_CLASSES))(x)

    model = Model(inputs=model.inputs, outputs=head)

    model.compile(optimizer=Adadelta(), loss='binary_crossentropy')

    print('len(model.layers)', len(model.layers)) #
    print(model.summary()) #

    return model

Model summary:

len(model.layers) 21
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, None, None, 3)     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, None, None, 64)    0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, None, None, 128)   0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, None, None, 256)   295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, None, None, 256)   590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, None, None, 256)   0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, None, None, 512)   1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, None, None, 512)   0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, None, None, 512)   2359808   
_________________________________________________________________
conv2d_1 (Conv2D)            (None, None, None, 512)   2359808   
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, None, None, 3)     1572867   
_________________________________________________________________
reshape_1 (Reshape)          (None, None, 3)           0         
=================================================================
Total params: 18,647,363
Trainable params: 3,932,675
Non-trainable params: 14,714,688
_________________________________________________________________
None

But when I train the model, it only predicts the most dominant class. My dataset is imbalanced:

Pixel area per class ratio:
class1 : 62.93 %
class2 : 25.46 %
class3 : 11.61 %

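So my questions are: is my model definition sound? How should I deal with the class imbalance? Should the batches perhaps be constructed in some special way?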

2 Answers:

Answer 0: (score: 1)

It looks like your loss is not appropriate for your problem. You are using binary cross-entropy loss here:

model.compile(optimizer=Adadelta(), loss='binary_crossentropy')

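But you have more than two classes, so I suggest using the categorical_crossentropy loss instead (shown in the list of losses here; the bottom of that page explains how to prepare your data for this loss).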
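As a rough illustration of the data preparation that categorical_crossentropy expects, the sketch below one-hot encodes an integer label mask into the (H*W, NUMBER_OF_CLASSES) layout produced by the Reshape head in the question. The helper name one_hot_mask and the toy mask are assumptions made for illustration, not part of the answer. Note also that categorical_crossentropy is normally paired with a softmax output rather than the sigmoid used in the question's Conv2DTranspose layer.

```python
import numpy as np

NUMBER_OF_CLASSES = 3  # assumption: three classes, as in the question's summary


def one_hot_mask(mask, num_classes=NUMBER_OF_CLASSES):
    """Turn an (H, W) integer label mask into an (H*W, num_classes) one-hot array."""
    flat = np.asarray(mask, dtype=np.int64).reshape(-1)  # flatten pixels row by row
    return np.eye(num_classes, dtype=np.float32)[flat]   # one identity row per pixel


# Toy 2x3 mask with class ids 0..2 (hypothetical data)
mask = np.array([[0, 0, 1],
                 [2, 1, 0]])
target = one_hot_mask(mask)
print(target.shape)  # (6, 3)
```

Applying the same encoding to every image in a batch gives targets of shape (batch, H*W, NUMBER_OF_CLASSES), which lines up with the model output after the Reshape layer.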

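There are other types of loss that may suit the imbalanced-class case better. You can try Dice loss, which is a differentiable approximation of IoU (intersection over union).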

This loss is described on page 6, section 3, here.
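To make the idea concrete, here is a minimal NumPy sketch of one common form of soft Dice loss, averaged over classes. The exact formulation in the referenced paper is not reproduced in this answer, so the function name soft_dice_loss, the eps smoothing term, and the per-class averaging are all assumptions. In an actual Keras loss the same arithmetic would need to be written with backend tensor operations so that gradients can flow.

```python
import numpy as np


def soft_dice_loss(y_true, y_pred, eps=1e-7):
    """Soft Dice loss averaged over classes.

    y_true: one-hot targets, shape (pixels, classes)
    y_pred: predicted probabilities, same shape
    """
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    intersection = (y_true * y_pred).sum(axis=0)           # per-class overlap
    denominator = y_true.sum(axis=0) + y_pred.sum(axis=0)  # per-class total mass
    dice = (2.0 * intersection + eps) / (denominator + eps)
    return 1.0 - dice.mean()                               # near 0 for a perfect match


# Hypothetical 4-pixel example with classes 0, 0, 1, 2
y_true = np.eye(3)[[0, 0, 1, 2]]
always_majority = np.eye(3)[[0, 0, 0, 0]]  # predicts class 0 everywhere

perfect_loss = soft_dice_loss(y_true, y_true)
majority_loss = soft_dice_loss(y_true, always_majority)
print(perfect_loss, majority_loss)
```

Because each class contributes equally to the mean, a prediction that collapses to the dominant class is penalized heavily for the classes it never predicts, which is why this kind of loss is often tried on imbalanced segmentation data.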

Answer 1: (score: 0)

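A common approach is to apply class weighting in the loss function, so that the influence of the dominant classes is reduced:

weight_class = 1 / ln(c + class_probability)

where c is a constant; a commonly used value is 1.03.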
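Applied to the pixel-area ratios from the question, the formula can be sketched as follows. The toy masks array and the variable names are hypothetical; only the ratios, the formula, and c = 1.03 come from the text above.

```python
import numpy as np

# Pixel-area ratios from the question, expressed as fractions
class_probability = np.array([0.6293, 0.2546, 0.1161])
c = 1.03  # constant quoted in the answer

# Rarer classes get larger weights
class_weights = 1.0 / np.log(c + class_probability)
print(class_weights)

# Hypothetical batch of integer label masks, shape (batch, H, W)
masks = np.array([[[0, 0],
                   [1, 2]]])

# One weight per pixel, flattened to (batch, H*W) to match the reshaped output
pixel_weights = class_weights[masks.reshape(masks.shape[0], -1)]
print(pixel_weights.shape)  # (1, 4)
```

With Keras 2, one possible way to use such per-pixel weights on an output of shape (batch, H*W, classes) is to compile the model with sample_weight_mode='temporal' and pass the array as sample_weight to fit. Treat this as a suggestion to check against the Keras version in use.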