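I have a binary classification problem with classes background (bg) = 0 and signal (sig) = 1, for which I am training a NN. For monitoring purposes, I am trying to implement a custom metric in Keras with the TensorFlow backend that does the following:
1) Compute the threshold on my NN output that would result in a false positive rate (bg classified as signal) of X (in this case X = 0.02, but it could be anything).
2) Compute the true positive rate at this threshold.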
Given numpy arrays y_true and y_pred, I would write a function like:

    import numpy as np

    def eff_at_2percent_metric(y_true, y_pred):
        # Select the NN outputs of the bg events (label 0)
        bg_predictions = y_pred[y_true < 0.5]
        # Order them by NN output, highest first
        ordered_bg_predictions = np.sort(bg_predictions)[::-1]
        # Find the threshold with a 2% false positive rate
        threshold = ordered_bg_predictions[int(round(0.02 * len(ordered_bg_predictions)))]
        # Select the NN outputs of the signal events (label 1), lowest first
        ordered_sig_predictions = np.sort(y_pred[y_true > 0.5])
        # Find the true positive rate at this threshold
        sig_eff = 1 - np.searchsorted(ordered_sig_predictions, threshold) / len(ordered_sig_predictions)
        return sig_eff

Of course, this does not work, because to implement a custom metric, y_true and y_pred must be TensorFlow tensors rather than numpy arrays. Is there a way to make this work?
Answer 0: (score: 2)
There is a metric for sensitivity at specificity, which I believe is equivalent (specificity is one minus the FPR).
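To make the equivalence concrete: a true positive rate at FPR = 0.02 is the sensitivity at specificity = 0.98. The sketch below is a plain numpy illustration of that idea, scanning a fixed grid of thresholds and picking the one whose specificity is closest to the target. The function name `sensitivity_at_specificity`, the `num_thresholds` parameter, and the "closest specificity" rule are assumptions made for this illustration; it is not TensorFlow's implementation.

```python
import numpy as np

def sensitivity_at_specificity(y_true, y_pred, specificity=0.98, num_thresholds=200):
    """Hypothetical helper: TPR at the grid threshold whose specificity is closest to the target."""
    labels = np.asarray(y_true).ravel() > 0.5            # True for signal events
    scores = np.asarray(y_pred, dtype=float).ravel()     # NN outputs, assumed to lie in [0, 1]
    thresholds = np.linspace(0.0, 1.0, num_thresholds)
    # passed[i, j] is True when event j scores above threshold i
    passed = scores[None, :] > thresholds[:, None]
    tp = np.sum(passed & labels, axis=1)                 # signal events passing each threshold
    fp = np.sum(passed & ~labels, axis=1)                # bg events passing each threshold
    n_sig = max(int(np.sum(labels)), 1)                  # guard against division by zero
    n_bg = max(int(np.sum(~labels)), 1)
    tpr = tp / n_sig                                     # sensitivity
    spec = 1.0 - fp / n_bg                               # specificity = 1 - FPR
    best = np.argmin(np.abs(spec - specificity))
    return tpr[best]

# Small usage example: the target FPR of 0.25 corresponds to specificity 0.75
y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_pred = np.array([0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9])
print(sensitivity_at_specificity(y_true, y_pred, specificity=0.75))
```

Inside Keras you would not need this helper: TensorFlow 1.x exposes `tf.metrics.sensitivity_at_specificity`, and TensorFlow 2.x has `tf.keras.metrics.SensitivityAtSpecificity`, which can be listed in `metrics=[...]` when compiling the model. Because such metrics work on a threshold grid, the result is an approximation of the exact sorted-array calculation in the question.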
Answer 1: (score: 0)
You can implement your own metric; here is an example for the false positive rate:

    # Note: these are private TensorFlow 1.x internals; module paths and
    # helper names may differ between TensorFlow versions.
    from tensorflow.python.eager import context
    from tensorflow.python.framework import dtypes
    from tensorflow.python.framework import ops
    from tensorflow.python.ops import array_ops
    from tensorflow.python.ops import math_ops
    from tensorflow.python.ops import variable_scope
    from tensorflow.python.ops.metrics_impl import _aggregate_across_towers
    from tensorflow.python.ops.metrics_impl import true_negatives
    from tensorflow.python.ops.metrics_impl import false_positives
    from tensorflow.python.ops.metrics_impl import _remove_squeezable_dimensions


    def false_positive_rate(labels,
                            predictions,
                            weights=None,
                            metrics_collections=None,
                            updates_collections=None,
                            name=None):
        # Streaming metrics rely on graph-mode variables and update ops.
        if context.executing_eagerly():
            raise RuntimeError('false_positive_rate is not supported when '
                               'eager execution is enabled.')

        with variable_scope.variable_scope(name, 'false_alarm',
                                           (predictions, labels, weights)):
            # Both inputs are cast to bool, so predictions should already be
            # thresholded to 0/1 before they are passed in.
            predictions, labels, weights = _remove_squeezable_dimensions(
                predictions=math_ops.cast(predictions, dtype=dtypes.bool),
                labels=math_ops.cast(labels, dtype=dtypes.bool),
                weights=weights)

            # Running count of false positives.
            false_p, false_positives_update_op = false_positives(
                labels,
                predictions,
                weights,
                metrics_collections=None,
                updates_collections=None,
                name=None)

            # Running count of true negatives.
            true_n, true_negatives_update_op = true_negatives(
                labels,
                predictions,
                weights,
                metrics_collections=None,
                updates_collections=None,
                name=None)

            def compute_false_positive_rate(true_n, false_p, name):
                # FPR = FP / (FP + TN); return 0 when there are no negatives.
                return array_ops.where(
                    math_ops.greater(true_n + false_p, 0),
                    math_ops.div(false_p, true_n + false_p), 0, name)

            def once_across_towers(_, true_n, false_p):
                return compute_false_positive_rate(true_n, false_p, 'value')

            fpr = _aggregate_across_towers(
                metrics_collections, once_across_towers, true_n, false_p)

            update_op = compute_false_positive_rate(true_negatives_update_op,
                                                    false_positives_update_op,
                                                    'update_op')
            if updates_collections:
                ops.add_to_collections(updates_collections, update_op)

            return fpr, update_op

You can adapt the code to compute the true positive rate.
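As a rough guide to what that adaptation involves, the two rates differ only in which confusion-matrix counts they use: FPR = FP / (FP + TN), while TPR = TP / (TP + FN). The numpy sketch below shows both side by side for a fixed threshold. The helper name `rates_at_threshold` and its `threshold` parameter are made up for this illustration, and it works on whole arrays rather than as a streaming TensorFlow metric.

```python
import numpy as np

def rates_at_threshold(y_true, y_pred, threshold=0.5):
    """Hypothetical helper: return (FPR, TPR) for predictions cut at a fixed threshold."""
    labels = np.asarray(y_true).ravel() > 0.5
    predictions = np.asarray(y_pred, dtype=float).ravel() > threshold
    tp = np.sum(predictions & labels)        # signal classified as signal
    fp = np.sum(predictions & ~labels)       # bg classified as signal
    tn = np.sum(~predictions & ~labels)      # bg classified as bg
    fn = np.sum(~predictions & labels)       # signal classified as bg
    # Mirror the guard in the answer's code: return 0 when the denominator is empty
    fpr = fp / (fp + tn) if (fp + tn) > 0 else 0.0
    tpr = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return fpr, tpr

y_true = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y_pred = np.array([0.1, 0.2, 0.6, 0.3, 0.9, 0.8, 0.4, 0.7])
fpr, tpr = rates_at_threshold(y_true, y_pred, threshold=0.5)
print(fpr, tpr)
```

In the TensorFlow version, the same change would presumably mean replacing the false-positive and true-negative counters with true-positive and false-negative ones and dividing TP by TP + FN.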