Question

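I recently started learning semantic segmentation, and I am trying to train a UNet for it. My inputs are RGB images of shape 128x128x3. My masks contain 4 classes (0, 1, 2, 3) and are one-hot encoded, with shape 128x128x4.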

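import tensorflow as tf
from tensorflow.keras import backend as K

# NOTE: batch_size is a global defined elsewhere in the training script (not shown here).

def weighted_cce(y_true, y_pred):
    # Count the pixels of each class in the batch (epsilon avoids division by zero)
    labels = tf.argmax(y_true, axis=-1)
    counts = []
    for i in range(4):
        n = tf.cast(tf.math.count_nonzero(tf.equal(labels, i)), 'float32') + K.epsilon()
        counts.append(n)

    # Per-class weight: batch_size / pixel count, stacked into a shape-(4,) tensor
    weights = tf.stack([batch_size / n for n in counts])

    # Normalize predictions so the class probabilities sum to 1
    y_pred /= K.sum(y_pred, axis=-1, keepdims=True)
    # Clip to prevent NaNs and Infs
    y_pred = K.clip(y_pred, K.epsilon(), 1 - K.epsilon())
    # Weighted cross-entropy per pixel, summed over the class axis
    loss = y_true * K.log(y_pred) * weights
    loss = -K.sum(loss, -1)
    return loss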

This is the loss function I am using, but the model classifies every pixel as class 2. What am I doing wrong?
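To make the weighting easier to inspect, here is a minimal NumPy sketch that mirrors only the weight computation from the function above (`batch_size` divided by the per-class pixel count). The helper name `class_weights_from_masks`, the `eps` default, and the synthetic 4x4 masks are assumptions made for illustration; they are not part of the training code.

```python
import numpy as np

def class_weights_from_masks(y_true, batch_size, eps=1e-7):
    """Hypothetical helper: batch_size / pixel count for each class.

    y_true: one-hot masks with shape (batch, H, W, num_classes).
    """
    labels = np.argmax(y_true, axis=-1)  # integer labels, shape (batch, H, W)
    num_classes = y_true.shape[-1]
    counts = np.array(
        [np.count_nonzero(labels == c) for c in range(num_classes)],
        dtype=np.float32,
    ) + eps  # eps avoids division by zero for absent classes
    return batch_size / counts

# Synthetic batch (assumed data): 2 masks of 4x4 pixels, mostly class 0.
labels = np.zeros((2, 4, 4), dtype=np.int64)
labels[0, 0, 0] = 1      # 1 pixel of class 1
labels[0, 0, 1:3] = 2    # 2 pixels of class 2
labels[1, 3, :] = 3      # 4 pixels of class 3
y_true = np.eye(4, dtype=np.float32)[labels]  # one-hot, shape (2, 4, 4, 4)

weights = class_weights_from_masks(y_true, batch_size=2)
print(weights)
```

Because the weights are recomputed from each batch and are not normalized, the rarest class in a batch receives the largest weight, and a class that is absent from the batch gets a weight on the order of `batch_size / eps`. Printing these values for a few real batches may be a useful first check.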

Weighted pixel-wise categorical cross-entropy for semantic segmentation

0 answers: