
Time: 2016-02-02 14:07:36

Tags: classification tensorflow

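I am trying to apply deep learning to a binary classification problem with a heavy imbalance between the target classes (500k vs. 31k). I want to write a custom loss function that looks like this: minimize(100 - ((predicted_smallerclass)/(total_smallerclass))*100)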


7 answers:

Answer 0: (score: 43)

You can add class weights to the loss function by multiplying the logits. The regular cross-entropy loss is:

loss(x, class) = -log(exp(x[class]) / (\sum_j exp(x[j])))
               = -x[class] + log(\sum_j exp(x[j]))
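
As a quick check that the two forms above agree, here is a minimal framework-free sketch in plain Python. The function names and the sample logits are illustrative assumptions, not part of the answer.

```python
import math

def xent_from_softmax(x, cls):
    """First form: -log(exp(x[cls]) / sum_j exp(x[j]))."""
    denom = sum(math.exp(v) for v in x)
    return -math.log(math.exp(x[cls]) / denom)

def xent_from_logsumexp(x, cls):
    """Second form: -x[cls] + log(sum_j exp(x[j]))."""
    return -x[cls] + math.log(sum(math.exp(v) for v in x))

logits = [2.0, -1.0]  # hypothetical logits for a single example
loss_a = xent_from_softmax(logits, 0)
loss_b = xent_from_logsumexp(logits, 0)
print(loss_a, loss_b)  # the two values should match up to rounding
```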

In the weighted case:

loss(x, class) = weights[class] * -x[class] + log(\sum_j exp(weights[class] * x[j]))

So by multiplying logits, you are re-scaling predictions of each class by its class weight.

For example:

ratio = 31.0 / (500.0 + 31.0)
class_weight = tf.constant([ratio, 1.0 - ratio])
logits = ... # shape [batch_size, 2]
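# tf.mul was renamed to tf.multiply in TensorFlow 1.0
weighted_logits = tf.multiply(logits, class_weight) # shape [batch_size, 2]
# TensorFlow 1.x requires labels and logits to be passed by keyword
xent = tf.nn.softmax_cross_entropy_with_logits(
  labels=labels, logits=weighted_logits, name="xent_raw")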

There is now a standard loss function that supports weights within a batch:

tf.losses.sparse_softmax_cross_entropy(labels=label, logits=logits, weights=weights)

Where weights should be transformed from class weights to a weight per example (with shape [batch_size]). See documentation here.
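
The conversion from class weights to one weight per example is just a lookup by label. Below is a minimal sketch in plain Python, assuming integer labels where class 0 is the majority class and class 1 the minority class; the batch values and helper names are hypothetical.

```python
import math

def per_example_weights(labels, class_weights):
    """Look up each example's weight from its integer class label."""
    return [class_weights[y] for y in labels]

def xent(x, cls):
    """Unweighted softmax cross entropy for one example."""
    return -x[cls] + math.log(sum(math.exp(v) for v in x))

ratio = 31.0 / (500.0 + 31.0)
class_weights = [ratio, 1.0 - ratio]  # class 0 = majority, class 1 = minority (assumed)

labels = [0, 0, 1]                              # hypothetical batch of 3 labels
logits = [[1.5, -0.5], [0.2, 0.1], [0.3, 0.9]]  # hypothetical logits, shape [3, 2]

weights = per_example_weights(labels, class_weights)  # shape [batch_size]
weighted_losses = [w * xent(x, y) for w, x, y in zip(weights, logits, labels)]
print(weights, weighted_losses)
```

In TensorFlow the same lookup is usually written as tf.gather(class_weights, labels). How the weighted losses are then averaged depends on the reduction the loss function applies, so this sketch stops at the per-example values.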

Answer 1: (score: 40)

The code you proposed seems wrong to me. I agree that the loss should be multiplied by the weight.


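But if you multiply the logits by the class weight, you end up with:

weights[class] * -x[class] + log( \sum_j exp(x[j] * weights[class]) )

The second term is not equal to:

weights[class] * log(\sum_j exp(x[j]))

which can be rewritten as:

log( (\sum_j exp(x[j])) ^ weights[class] )

Instead, weight each example's loss by the weight of its label: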


ratio = 31.0 / (500.0 + 31.0)
class_weight = tf.constant([[ratio, 1.0 - ratio]])
logits = ... # shape [batch_size, 2]

# labels is assumed to be a one-hot float tensor of shape [batch_size, 2]
weight_per_label = tf.transpose( tf.matmul(labels
                           , tf.transpose(class_weight)) ) #shape [1, batch_size]
# this is the weight for each datapoint, depending on its label

# tf.mul was renamed to tf.multiply in TensorFlow 1.0
xent = tf.multiply(weight_per_label
         , tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits, name="xent_raw")) #shape [1, batch_size]
loss = tf.reduce_mean(xent) #shape 1
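
The difference between scaling the logits and scaling the loss can be checked numerically. The following is a small plain-Python sketch, assuming a single weight applied to every logit of the example as in the formula with weights[class]; the logits and the weight value are made up for illustration.

```python
import math

def xent(x, cls):
    """Softmax cross entropy for one example: -x[cls] + log(sum_j exp(x[j]))."""
    return -x[cls] + math.log(sum(math.exp(v) for v in x))

x = [2.0, -1.0]  # hypothetical logits
cls = 1          # true class of this example
w = 0.5          # hypothetical weight of that class

loss_scaled_logits = xent([w * v for v in x], cls)  # logits multiplied by w
loss_scaled_loss = w * xent(x, cls)                 # loss multiplied by w
print(loss_scaled_logits, loss_scaled_loss)         # the two values differ
```

Scaling the logits changes the sharpness of the softmax rather than the contribution of the example, which is why only the second variant behaves like a per-example weight.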

Answer 2: (score: 11)

Use tf.nn.weighted_cross_entropy_with_logits() and set pos_weight to 1 / (expected ratio of positives).
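
To make the effect of pos_weight concrete, here is a plain-Python sketch of the per-example formula given in the TensorFlow documentation for this op: pos_weight * labels * -log(sigmoid(logits)) + (1 - labels) * -log(1 - sigmoid(logits)). It assumes the minority class (31k of 531k examples) is the positive class, and it uses the naive formula rather than a numerically stable one.

```python
import math

def weighted_sigmoid_xent(label, logit, pos_weight):
    """Naive weighted sigmoid cross entropy for a single example."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return (pos_weight * label * -math.log(p)
            + (1.0 - label) * -math.log(1.0 - p))

positive_ratio = 31.0 / (500.0 + 31.0)  # assumed share of positive examples
pos_weight = 1.0 / positive_ratio       # about 17.1

loss_pos = weighted_sigmoid_xent(1.0, 0.0, pos_weight)  # positive example, logit 0
loss_neg = weighted_sigmoid_xent(0.0, 0.0, pos_weight)  # negative example, logit 0
print(pos_weight, loss_pos, loss_neg)
```

At the same logit, a misclassified positive costs pos_weight times as much as a misclassified negative, so errors on the rare class dominate the gradient.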

Answer 3: (score: 4)

See the TensorFlow guide at https://www.tensorflow.org/api_guides/python/contrib.losses




inputs, labels = LoadData(batch_size=3)
logits = MyModelPredictions(inputs)

# Ensures that the loss for examples whose ground truth class is `3` is 5x
# higher than the loss for all other examples.
weight = tf.multiply(4, tf.cast(tf.equal(labels, 3), tf.float32)) + 1

onehot_labels = tf.one_hot(labels, num_classes=5)
tf.contrib.losses.softmax_cross_entropy(logits, onehot_labels, weight=weight)

Answer 4: (score: 3)



import numpy as np
from sklearn.utils.class_weight import compute_sample_weight

# use class weights for handling unbalanced dataset
if mode == 'INFER':  # test/dev mode, not weighing loss in test mode
    sample_weights = np.ones(labels.shape)
else:
    sample_weights = compute_sample_weight(class_weight='balanced', y=labels)


# an extra placeholder for sample weights
# assuming you already have batch_size tensor
self.sample_weight = tf.placeholder(dtype=tf.float32, shape=[None])
cross_entropy_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
                       labels=self.label, logits=logits)
cross_entropy_loss = tf.reduce_sum(cross_entropy_loss*self.sample_weight) / batch_size
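
For reference, scikit-learn documents the 'balanced' mode as n_samples / (n_classes * np.bincount(y)). The sketch below reproduces that formula in plain Python for a tiny, made-up label list; the function name is hypothetical and it is not a replacement for compute_sample_weight.

```python
from collections import Counter

def balanced_sample_weights(labels):
    """Per-example weights using n_samples / (n_classes * count_of_class)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    class_weight = {c: n_samples / (n_classes * k) for c, k in counts.items()}
    return [class_weight[y] for y in labels]

labels = [0, 0, 0, 0, 1]  # hypothetical labels: 4 majority, 1 minority
weights = balanced_sample_weights(labels)
print(weights)
```

With this scheme every class contributes the same total weight, and the weights sum to the number of samples.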

Answer 5: (score: 1)

I used the op tf.nn.weighted_cross_entropy_with_logits() for two classes:

classes_weights = tf.constant([0.1, 1.0])
cross_entropy = tf.nn.weighted_cross_entropy_with_logits(logits=logits, targets=labels, pos_weight=classes_weights)

Answer 6: (score: 0)

import tensorflow as tf
from tensorflow.keras import backend as K  # assumed import; the snippet uses K without importing it


def weighted_binary_crossentropy(pos_weight=1):
    """Weighted binary crossentropy between an output tensor and a target tensor.
    # Arguments
        pos_weight: A coefficient to use on the positive examples.
    # Returns
        A loss function supposed to be used in model.compile().
    """
    def _to_tensor(x, dtype):
        """Convert the input `x` to a tensor of type `dtype`.
        # Arguments
            x: An object to be converted (numpy array, list, tensors).
            dtype: The destination type.
        # Returns
            A tensor.
        """
        return tf.convert_to_tensor(x, dtype=dtype)

    def _calculate_weighted_binary_crossentropy(target, output, from_logits=False):
        """Calculate weighted binary crossentropy between an output tensor and a target tensor.
        # Arguments
            target: A tensor with the same shape as `output`.
            output: A tensor.
            from_logits: Whether `output` is expected to be a logits tensor.
                By default, we consider that `output`
                encodes a probability distribution.
        # Returns
            A tensor.
        """
        # Note: tf.nn.sigmoid_cross_entropy_with_logits
        # expects logits, Keras expects probabilities.
        if not from_logits:
            # transform back to logits
            _epsilon = _to_tensor(K.epsilon(), output.dtype.base_dtype)
            output = tf.clip_by_value(output, _epsilon, 1 - _epsilon)
            output = tf.math.log(output / (1 - output))
        target = tf.dtypes.cast(target, tf.float32)
        return tf.nn.weighted_cross_entropy_with_logits(labels=target, logits=output, pos_weight=pos_weight)

    def _weighted_binary_crossentropy(y_true, y_pred):
        return K.mean(_calculate_weighted_binary_crossentropy(y_true, y_pred), axis=-1)

    return _weighted_binary_crossentropy
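
The step that transforms probabilities back to logits can be illustrated in isolation. This is a plain-Python sketch, assuming an epsilon of 1e-7 (the default value of K.epsilon() in Keras); the helper names are hypothetical.

```python
import math

def prob_to_logit(p, eps=1e-7):
    """Clip p away from 0 and 1, then apply the inverse sigmoid log(p / (1 - p))."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

logit = prob_to_logit(0.8)
roundtrip = sigmoid(logit)       # should recover roughly 0.8
saturated = prob_to_logit(1.0)   # finite only because of the clipping
print(logit, roundtrip, saturated)
```

Without the clipping, a predicted probability of exactly 0 or 1 would produce a division by zero or a log of zero, which is why the code above clips before taking the logarithm.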