Question

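I am training a model with Keras fit_generator, using Keras 2.1.3 and TensorFlow 1.8.0. I defined a custom metric for model checkpointing: AUPRC (area under the precision-recall curve). The metric is added to the metrics list when the model is compiled.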

import functools

import tensorflow as tf
from keras import backend as K


def as_keras_metric(method):
    """Wrapper for turning tensorflow metrics into keras metrics"""
    @functools.wraps(method)
    def wrapper(y_true, y_pred, **kwargs):
        # tf.metrics.* functions return a (value_tensor, update_op) pair
        value, update_op = method(y_true, y_pred, **kwargs)
        # Initialize the metric's local accumulator variables when the metric tensor is built
        K.get_session().run(tf.local_variables_initializer())
        # Make sure update_op runs whenever the metric value is evaluated
        with tf.control_dependencies([update_op]):
            value = tf.identity(value)
        return value
    return wrapper


@as_keras_metric
def AUPRC(y_true, y_pred, curve='PR'):
    return tf.metrics.auc(y_true, y_pred, curve=curve,
                          summation_method='careful_interpolation')

When compiling the model, I add AUPRC to the metrics list:

model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy',AUPRC])

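While training the model with fit_generator(), I noticed something strange. fit_generator reports the validation AUPRC computed at the end of each epoch (over the entire validation set): 0.7 after epoch 1 and 0.85 after epoch 2. I then stopped training. These numbers seem much higher than I would expect for this dataset.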

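I then used the trained (early-stopped) model to predict on the same validation set and recomputed the AUPRC metric from those predictions. This time I get 0.45. The trained model somehow scores far lower than what was reported during training (lower even than the value seen after epoch 1).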

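It seems that, when training with Keras fit_generator, some bug in Keras or TensorFlow causes a wrong validation AUPRC to be reported. Or is there a mistake in the way the TensorFlow metric is being used as a Keras metric?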
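To make the second possibility concrete, here is a toy sketch of one hypothesis, which nothing in this thread confirms: tf.metrics.auc is a streaming metric whose local variables accumulate over every batch it sees, and the wrapper initializes them without ever resetting them. If that is what happens here, the value shown for validation would also contain statistics from the training batches. The class `StreamingPrecision` and the sample labels are hypothetical, and it tracks precision rather than AUPRC to keep the arithmetic short.

```python
class StreamingPrecision:
    """Toy stand-in for a streaming metric: counters persist across update() calls."""

    def __init__(self):
        self.tp = 0  # true positives accumulated so far
        self.fp = 0  # false positives accumulated so far

    def update(self, y_true, y_pred):
        # Add this batch to the running counters, then report the running value
        for t, p in zip(y_true, y_pred):
            if p == 1:
                self.tp += t
                self.fp += 1 - t
        return self.result()

    def result(self):
        predicted_pos = self.tp + self.fp
        return self.tp / predicted_pos if predicted_pos else 0.0

    def reset(self):
        self.tp = self.fp = 0


metric = StreamingPrecision()

# "Training" batch on which the model looks perfect
train_value = metric.update([1, 1, 1, 1], [1, 1, 1, 1])

# "Validation" batch evaluated WITHOUT resetting: training counts leak in
leaky_val = metric.update([0, 0, 1, 0], [1, 1, 1, 1])

# Same validation batch after an explicit reset
metric.reset()
clean_val = metric.update([0, 0, 1, 0], [1, 1, 1, 1])

print(train_value, leaky_val, clean_val)  # 1.0 0.625 0.25
```

In this toy the un-reset value (0.625) sits well above the value computed on the validation batch alone (0.25), which is the same direction as the gap described above. Whether this is the actual cause here would need to be checked against the real training run.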

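Note: one might argue that the various AUPRC implementations themselves are buggy, so I checked this number in two ways: with scikit-learn (the AUC of sklearn.metrics.precision_recall_curve) and with tensorflow.metrics.auc (PR curve). Both give me 0.45.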
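For reference, the kind of offline check described above can be sketched without any library. This is a minimal pure-Python sketch, assuming binary 0/1 labels and trapezoidal integration of the precision-recall curve; the function name `auprc` and the sample data are hypothetical and this is not the exact code used for the numbers in this question. It will not necessarily match tf.metrics.auc with 'careful_interpolation' digit for digit, since that uses a different summation rule.

```python
def auprc(y_true, y_score):
    """Trapezoidal area under the precision-recall curve for 0/1 labels."""
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("AUPRC is undefined without positive labels")

    # Rank examples from highest to lowest predicted score
    pairs = sorted(zip(y_score, y_true), key=lambda p: -p[0])

    # Conventional starting point of the curve: recall 0, precision 1
    recalls, precisions = [0.0], [1.0]
    tp = fp = 0
    for i, (score, label) in enumerate(pairs):
        tp += label
        fp += 1 - label
        # Emit one curve point per distinct threshold (end of a group of tied scores)
        if i + 1 == len(pairs) or pairs[i + 1][0] != score:
            recalls.append(tp / total_pos)
            precisions.append(tp / (tp + fp))

    # Integrate precision over recall with the trapezoidal rule
    area = 0.0
    for k in range(1, len(recalls)):
        width = recalls[k] - recalls[k - 1]
        area += width * (precisions[k] + precisions[k - 1]) / 2.0
    return area


# Hypothetical labels and scores, e.g. scores = model.predict(...) flattened to a list
perfect = auprc([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
mixed = auprc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(perfect, round(mixed, 4))  # 1.0 0.7917
```

Running a check like this on predictions collected over the whole validation set, outside the training loop, gives a number that does not depend on any state kept by the metric during training.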

Metrics reported incorrectly when training with Keras fit_generator

0 Answers: