Question

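I am working on a classification problem with three classes, "A", "B" and "C", and I would like to build penalties for different types of misclassification into my model's loss function (somewhat like weighted cross-entropy). Class weights are not suitable, because they apply to all data belonging to a class. For example, a true label "B" that is misclassified as "C" should incur a higher loss than one misclassified as "A". The weight table is as follows: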

   A  B  C  
A  1  1  1  
B  1  1  1.2 
C  1  1  1

With the current categorical_crossentropy loss, for true class "B", if my predicted softmax is

0.5 0.4 0.1  vs 0.1 0.4 0.5

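the categorical_crossentropy will be the same. It makes no difference whether "B" is misclassified as "A" or as "C". I want the loss for the second softmax prediction to be higher than for the first.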
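The point above can be checked by hand. The following is a minimal pure-Python sketch, not Keras code: it assumes a one-hot label for "B" and the class order A, B, C, and the helper name `cross_entropy` is illustrative only.

```python
import math

# Softmax outputs from the question; class order is A, B, C.
pred_1 = [0.5, 0.4, 0.1]  # argmax is A
pred_2 = [0.1, 0.4, 0.5]  # argmax is C
true_idx = 1              # the true label is B


def cross_entropy(probs, true_idx):
    """Categorical cross-entropy for a one-hot label: -log(p_true)."""
    return -math.log(probs[true_idx])


loss_1 = cross_entropy(pred_1, true_idx)
loss_2 = cross_entropy(pred_2, true_idx)

# Both predictions assign 0.4 to B, so both losses are -log(0.4), about 0.916.
print(loss_1, loss_2)

# What the question asks for instead: scale the second loss by the
# B -> C entry of the weight table (1.2), leaving the first unchanged.
desired_loss_2 = 1.2 * loss_2
print(desired_loss_2)
```

Plain cross-entropy only looks at the probability given to the true class, which is why the two cases cannot be told apart without an extra weight.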

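I have tried https://github.com/keras-team/keras/issues/2115, but none of the code there works with Keras v2. Any help with feeding the weight matrix directly into a Keras loss function would be highly appreciated.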

Answer 1

You can change the loss function so that it multiplies the loss value by the appropriate weight from your matrix.

So, as an example, consider the MNIST TensorFlow example:

import tensorflow as tf

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

Suppose we want to change this so that the loss is weighted according to the following matrix:

weights  = tf.constant([
       [1., 1.2, 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1.2, 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 10.9, 1.2, 1., 1., 1., 1., 1., 1.],
       [1., 0.9, 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])

Then we can wrap the existing sparse_categorical_crossentropy in a new custom loss function that multiplies the loss by the appropriate weight. Like this:

def custom_loss(y_true, y_pred):
  # get the prediction from the final softmax layer:
  pred_idx = tf.argmax(y_pred, axis=1, output_type=tf.int32)

  # stack these so we have a tensor of [[predicted_i, actual_i], ...,] for each i in batch
  indices = tf.stack([tf.reshape(pred_idx, (-1,)), 
                       tf.reshape(tf.cast( y_true, tf.int32), (-1,))
                     ], axis=1)

  # use tf.gather_nd() to convert indices to the appropriate weight from our matrix [w_i, ...] for each i in batch
  batch_weights = tf.gather_nd(weights, indices)


  return batch_weights * tf.keras.losses.sparse_categorical_crossentropy(y_true, y_pred)

We can then use this new custom loss function in the model:

model.compile(optimizer='adam',
              loss=custom_loss,
              metrics=['accuracy'])
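One thing to watch is the index order. custom_loss looks up weights[predicted, actual], whereas the table in the question appears to be laid out with true labels as rows and predictions as columns (the 1.2 sits in row B, column C). Under that reading, the question's table would need to be transposed before being used with this loss. Here is a pure-Python sketch of the lookup; `argmax` and `lookup_weights` are illustrative helpers that mimic the tf.argmax / tf.gather_nd steps and are not part of TensorFlow.

```python
# The question's table, assuming rows = true label and columns = predicted label.
table = [
    [1.0, 1.0, 1.0],  # true A
    [1.0, 1.0, 1.2],  # true B
    [1.0, 1.0, 1.0],  # true C
]


def argmax(values):
    """Index of the largest entry, like tf.argmax on one row."""
    return max(range(len(values)), key=values.__getitem__)


def lookup_weights(weights, y_true, y_pred):
    """Mimic tf.gather_nd(weights, [[pred_i, true_i], ...])."""
    return [weights[argmax(probs)][label] for label, probs in zip(y_true, y_pred)]


y_true = [1, 1]  # both samples are really B
y_pred = [[0.5, 0.4, 0.1], [0.1, 0.4, 0.5]]

# Using the table as written picks entries [A][B] and [C][B]: no extra penalty.
as_written = lookup_weights(table, y_true, y_pred)

# Transposing makes rows = predicted, columns = true, as custom_loss expects.
transposed = [list(column) for column in zip(*table)]
after_transpose = lookup_weights(transposed, y_true, y_pred)

print(as_written)       # [1.0, 1.0]
print(after_transpose)  # [1.0, 1.2]
```

Alternatively, the two entries passed to tf.stack in custom_loss could be swapped so that the matrix is indexed as [actual, predicted].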

Answer 2

Based on issue #2115, I wrote the following solution and posted it there too.
I have only tested it with TensorFlow 1.14, so I think it should work with Keras v2.

Building on the class-based solution in #2115 (comment), here is a more robust and vectorized implementation:

import tensorflow.keras.backend as K
from tensorflow.keras.losses import CategoricalCrossentropy


class WeightedCategoricalCrossentropy(CategoricalCrossentropy):

    def __init__(self, cost_mat, name='weighted_categorical_crossentropy', **kwargs):
        assert(cost_mat.ndim == 2)
        assert(cost_mat.shape[0] == cost_mat.shape[1])

        super().__init__(name=name, **kwargs)
        self.cost_mat = K.cast_to_floatx(cost_mat)

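    def __call__(self, y_true, y_pred, sample_weight=None):
        # Keras passes sample_weight when it calls the loss, so the argument
        # must be accepted. The weights are derived from the cost matrix.
        assert sample_weight is None, "sample weights are derived from the cost matrix"

        return super().__call__(
            y_true=y_true,
            y_pred=y_pred,
            sample_weight=get_sample_weights(y_true, y_pred, self.cost_mat),
        )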


def get_sample_weights(y_true, y_pred, cost_m):
    num_classes = len(cost_m)

    y_pred.shape.assert_has_rank(2)
    # Slicing keeps a TensorShape, so this also works when shape[1] is a plain int.
    y_pred.shape[1:].assert_is_compatible_with([num_classes])
    y_pred.shape.assert_is_compatible_with(y_true.shape)

    y_pred = K.one_hot(K.argmax(y_pred), num_classes)

    y_true_nk1 = K.expand_dims(y_true, 2)
    y_pred_n1k = K.expand_dims(y_pred, 1)
    cost_m_1kk = K.expand_dims(cost_m, 0)

    sample_weights_nkk = cost_m_1kk * y_true_nk1 * y_pred_n1k
    sample_weights_n = K.sum(sample_weights_nkk, axis=[1, 2])

    return sample_weights_n

Usage:

model.compile(loss=WeightedCategoricalCrossentropy(cost_matrix), ...)
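To see what get_sample_weights computes, here is a pure-Python sketch of the same broadcast-and-sum, under the assumption that y_true is one-hot and that the cost matrix is indexed as [true, predicted]. The name `sample_weights` and the nested-list `cost_matrix` (filled with the question's table) are illustrative only; the real function operates on tensors.

```python
def argmax(values):
    """Index of the largest entry in one row of predictions."""
    return max(range(len(values)), key=values.__getitem__)


def one_hot(index, num_classes):
    """One-hot list of floats with a 1.0 at the given index."""
    return [1.0 if k == index else 0.0 for k in range(num_classes)]


def sample_weights(y_true, y_pred, cost_m):
    """For each sample n: sum over i, j of cost[i][j] * true[n][i] * pred_onehot[n][j]."""
    k = len(cost_m)
    weights = []
    for true_row, pred_row in zip(y_true, y_pred):
        pred_hot = one_hot(argmax(pred_row), k)
        weights.append(sum(cost_m[i][j] * true_row[i] * pred_hot[j]
                           for i in range(k) for j in range(k)))
    return weights


# The question's table: rows = true label, columns = predicted label.
cost_matrix = [
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 1.2],
    [1.0, 1.0, 1.0],
]

y_true = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]  # both samples are B
y_pred = [[0.5, 0.4, 0.1], [0.1, 0.4, 0.5]]  # predicted A, predicted C

weights = sample_weights(y_true, y_pred, cost_matrix)
print(weights)  # [1.0, 1.2]
```

Because both one-hot vectors have a single non-zero entry, the double sum collapses to the single entry cost[true][predicted], which is then used as that sample's weight.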

Similarly, this can be applied to the CategoricalAccuracy metric:

from tensorflow.keras.metrics import CategoricalAccuracy


class WeightedCategoricalAccuracy(CategoricalAccuracy):

    def __init__(self, cost_mat, name='weighted_categorical_accuracy', **kwargs):
        assert(cost_mat.ndim == 2)
        assert(cost_mat.shape[0] == cost_mat.shape[1])

        super().__init__(name=name, **kwargs)
        self.cost_mat = K.cast_to_floatx(cost_mat)

    def update_state(self, y_true, y_pred, sample_weight=None):

        return super().update_state(
            y_true=y_true,
            y_pred=y_pred,
            sample_weight=get_sample_weights(y_true, y_pred, self.cost_mat),
        )

Usage:

model.compile(metrics=[WeightedCategoricalAccuracy(cost_matrix), ...], ...)
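As a rough illustration of the effect on the metric: this sketch assumes that the accuracy metric reduces to a weighted mean, that is, the weighted sum of matches divided by the sum of weights. The function name `weighted_accuracy` and the three-sample batch are made up for the example.

```python
def weighted_accuracy(matches, weights):
    """Weighted mean of per-sample matches: sum(m * w) / sum(w)."""
    return sum(m * w for m, w in zip(matches, weights)) / sum(weights)


# Three samples: B predicted as A (wrong), B predicted as C (wrong), A predicted as A (right).
matches = [0.0, 0.0, 1.0]
# Weights taken from the question's table at [true, predicted].
weights = [1.0, 1.2, 1.0]

plain = sum(matches) / len(matches)             # 1 / 3
weighted = weighted_accuracy(matches, weights)  # 1 / 3.2, about 0.3125

print(plain, weighted)
```

The costlier B -> C mistake takes up a larger share of the denominator, so the weighted accuracy comes out lower than the plain one.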

Keras: using different weights for different misclassifications
