Why does my custom streaming metric keep giving different results when run on the same input?

Time: 2017-10-25 13:44:15

Tags: python tensorflow

I'm trying to learn how to create my own custom streaming metrics in TensorFlow.

I started by trying to write my own function to compute the F1 score.

Here is what I have so far:

import tensorflow as tf
import numpy as np
from sklearn.metrics import precision_recall_fscore_support, f1_score, precision_score

sess = tf.InteractiveSession()

# Custom streaming metric to compute f1 score.
# Code is from answer to https://stackoverflow.com/questions/44764688/custom-metric-based-on-tensorflows-streaming-metrics-returns-nan/44935895
def metric_fn(predictions=None, labels=None, weights=None):
    P, update_op1 = tf.contrib.metrics.streaming_precision(predictions, labels)
    R, update_op2 = tf.contrib.metrics.streaming_recall(predictions, labels)
    eps = 1e-5;
    return (2*(P*R)/(P+R+eps), tf.group(update_op1, update_op2))


# True labels
labels = np.array([1, 0, 0, 1])
# Predicted labels
preds = np.array([1, 1, 0, 1])

f1 = metric_fn(preds, labels)

init1 = tf.global_variables_initializer()
init2 = tf.local_variables_initializer()
sess.run([init1, init2])

# Check result with output from sklearn
print(f1_score(labels, preds))

# Run a custom metric a few times
print(sess.run(f1))
print(sess.run(f1))
print(sess.run(f1))

This is the output I get:

0.8
(0.0, None)
(0.99999624, None)
(0.79999518, None)

The first line is the F1 score computed with sklearn's f1_score function, which is correct. The rest comes from metric_fn.

I don't understand the output of metric_fn. Why does the result of metric_fn keep changing even though I give it the same input? Also, even though the formula I coded seems correct, none of its results are right. What is going on, and what do I need to change to get the correct result?

1 Answer:

Answer 0 (score: 1):

You can split the output of metric_fn into two parts like this:

f1_value, update_op = metric_fn(preds, labels)

where f1_value is the current value of your score, and update_op is the operation that takes new values of preds and labels and updates the F1 score.
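
As an aside, every streaming metric in TF 1.x follows this same (value, update_op) pattern. Here is a minimal sketch of the idea using tf.metrics.mean (an illustrative example of mine, not taken from your code):

import tensorflow as tf

# Minimal sketch of the (value, update_op) pattern with tf.metrics.mean.
values = tf.placeholder(tf.float32)
mean_value, mean_update = tf.metrics.mean(values)

with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())              # streaming metrics use local variables
    sess.run(mean_update, feed_dict={values: [1.0, 2.0]})   # accumulate first batch
    sess.run(mean_update, feed_dict={values: [3.0]})        # accumulate second batch
    print(sess.run(mean_value))                              # 2.0, the mean over everything seen so far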

So, in this case, you could change your code as follows:

import tensorflow as tf
import numpy as np
from sklearn.metrics import precision_recall_fscore_support, f1_score, precision_score

sess = tf.InteractiveSession()

# Custom streaming metric to compute f1 score.
# Code is from answer to https://stackoverflow.com/questions/44764688/custom-metric-based-on-tensorflows-streaming-metrics-returns-nan/44935895
def metric_fn(predictions=None, labels=None, weights=None):
    P, update_op1 = tf.contrib.metrics.streaming_precision(predictions, labels)
    R, update_op2 = tf.contrib.metrics.streaming_recall(predictions, labels)
    eps = 1e-5;
    return (2*(P*R)/(P+R+eps), tf.group(update_op1, update_op2))


# True labels
labels = np.array([1, 0, 0, 1])
# Predicted labels
preds = np.array([1, 1, 0, 1])

f1_value, update_op = metric_fn(preds, labels)

init1 = tf.global_variables_initializer()
init2 = tf.local_variables_initializer()
sess.run([init1, init2])

# Check result with output from sklearn
print(f1_score(labels, preds))

# Run a custom metric a few times
print(sess.run(f1_value))
print(sess.run(update_op))
print(sess.run(f1_value))

And you get, as expected:

0.8 # Obtained with sklearn
0.0 # Value of f1_value before calling update_op
None # update_op does not return anything
0.799995 # Value of f1_value after calling update_op

Note that update_op returns None only because an op created with tf.group has no output. If run individually, update_op1 and update_op2 would return 0.6666667 (the updated precision) and 1.0 (the updated recall), respectively.
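
If you want to verify those individual values, one way (a sketch using a hypothetical metric_fn_debug variant, otherwise the same code as above) is to return the two update ops separately instead of grouping them:

import tensorflow as tf
import numpy as np

# Hypothetical variant of metric_fn that keeps the two update ops separate
# so their return values can be inspected.
def metric_fn_debug(predictions=None, labels=None, weights=None):
    P, update_op1 = tf.contrib.metrics.streaming_precision(predictions, labels)
    R, update_op2 = tf.contrib.metrics.streaming_recall(predictions, labels)
    eps = 1e-5
    return 2 * (P * R) / (P + R + eps), update_op1, update_op2

labels = np.array([1, 0, 0, 1])
preds = np.array([1, 1, 0, 1])

f1_value, update_precision, update_recall = metric_fn_debug(preds, labels)

with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())
    # Each individual update op returns its own updated metric value.
    print(sess.run(update_precision))  # 0.6666667 (precision)
    print(sess.run(update_recall))     # 1.0 (recall)
    print(sess.run(f1_value))          # ~0.8 after both updates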