Why does the Keras code for mean squared error return K.mean while categorical crossentropy returns reduce_sum?

Time: 2019-09-02 08:06:25

Tags: tensorflow keras

Why is it that:

  • the Keras code for mean squared error returns a mean, i.e. includes a 1/N factor. Is N the batch size here?
  • the Keras code for categorical crossentropy returns a `reduce_sum`, i.e. no 1/N factor. I recall that categorical crossentropy also needs to be divided by the batch size.

Please explain.

Here is the mean squared error code:

def mean_squared_error(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)
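A small NumPy sketch (not the actual Keras internals) may clarify which axis this `K.mean` reduces over. The shapes below are made-up toy values; the point is that `axis=-1` averages over the last (output/feature) axis, producing one loss value per sample, so the N in this particular 1/N is the number of output values per sample, not the batch size:

```python
import numpy as np

# Toy batch: 2 samples, each with 3 output values (hypothetical data).
y_true = np.array([[1.0, 2.0, 3.0],
                   [0.0, 1.0, 0.0]])
y_pred = np.array([[1.5, 2.0, 2.0],
                   [0.0, 0.0, 1.0]])

# Equivalent of K.mean(K.square(y_pred - y_true), axis=-1):
# the mean runs over the last axis only, NOT over the batch.
per_sample = np.mean(np.square(y_pred - y_true), axis=-1)
print(per_sample.shape)  # (2,) -- one loss value per sample

# Averaging over the batch size happens later, when the framework
# reduces these per-sample losses to a single scalar.
batch_loss = np.mean(per_sample)
```

So the loss function itself returns a per-sample loss vector; the division by the batch size is a separate, later reduction.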

Here is the categorical crossentropy code:

def categorical_crossentropy(target, output, from_logits=False, axis=-1):
    """Categorical crossentropy between an output tensor and a target tensor.
    # Arguments
        target: A tensor of the same shape as `output`.
        output: A tensor resulting from a softmax
            (unless `from_logits` is True, in which
            case `output` is expected to be the logits).
        from_logits: Boolean, whether `output` is the
            result of a softmax, or is a tensor of logits.
        axis: Int specifying the channels axis. `axis=-1`
            corresponds to data format `channels_last`,
            and `axis=1` corresponds to data format
            `channels_first`.
    # Returns
        Output tensor.
    # Raises
        ValueError: if `axis` is neither -1 nor one of
            the axes of `output`.
    """
    output_dimensions = list(range(len(output.get_shape())))
    if axis != -1 and axis not in output_dimensions:
        raise ValueError(
            '{}{}{}'.format(
                'Unexpected channels axis {}. '.format(axis),
                'Expected to be -1 or one of the axes of `output`, ',
                'which has {} dimensions.'.format(len(output.get_shape()))))
    # Note: tf.nn.softmax_cross_entropy_with_logits
    # expects logits, Keras expects probabilities.
    if not from_logits:
        # scale preds so that the class probas of each sample sum to 1
        output /= tf.reduce_sum(output, axis, True)
        # manual computation of crossentropy
        _epsilon = _to_tensor(epsilon(), output.dtype.base_dtype)
        output = tf.clip_by_value(output, _epsilon, 1. - _epsilon)
        return - tf.reduce_sum(target * tf.log(output), axis)
    else:
        return tf.nn.softmax_cross_entropy_with_logits(labels=target,
                                                       logits=output)

def categorical_crossentropy(y_true, y_pred):
    return K.categorical_crossentropy(y_true, y_pred)
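Here too, a NumPy sketch of just the reduction (toy values, not the Keras source) shows that the `reduce_sum` runs over the class axis, not the batch axis, so it plays the same role as the per-sample `K.mean` in the MSE code:

```python
import numpy as np

# Toy batch: 2 samples, 3 classes, one-hot targets (hypothetical data).
target = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
output = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1]])

# Equivalent of -tf.reduce_sum(target * tf.log(output), axis=-1):
# the sum runs over the CLASS axis only. With a one-hot target,
# all terms but one are zeroed out, so the result is again one
# loss value per sample.
per_sample = -np.sum(target * np.log(output), axis=-1)
print(per_sample.shape)  # (2,) -- one loss value per sample

# As with MSE, the division by the batch size is a later, separate
# reduction over these per-sample losses.
batch_loss = np.mean(per_sample)
```

The sum over classes is part of the definition of crossentropy for a single sample (one distribution per sample), which is why no 1/N appears inside the loss function itself.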

0 Answers:

There are no answers yet.