Question

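I am working on building a binary classifier. I have no ML experience, so I am using code adapted from the Iris classification tutorial on TensorFlow.org, and I get 85% accuracy on the test set. However, that evaluation is run with a threshold of 0.5, and I would like to try different thresholds to see whether I can get better accuracy. So I dug into the TensorFlow website and found the following function: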

tf.metrics.precision_at_thresholds(
    labels,
    predictions,
    thresholds,
    weights=None,
    metrics_collections=None,
    updates_collections=None,
    name=None
)

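It looks like exactly what I need, since it would let me evaluate accuracy with any custom thresholds I want. So I added this bit to my model's code, and the final result looks like this: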

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import numpy as np
import tensorflow as tf
train_file = "/home/javier/train.csv"
test_file = "/home/javier/test.csv"
def main():
    # Load datasets.
    training_set = tf.contrib.learn.datasets.base.load_csv_with_header(
        filename=train_file,
        target_dtype=np.int,
        features_dtype=np.float32)
    test_set = tf.contrib.learn.datasets.base.load_csv_with_header(
        filename=test_file,
        target_dtype=np.int,
        features_dtype=np.float32)
    # Specify that all features have real-value data
    feature_columns = [tf.contrib.layers.real_valued_column("", dimension=15)]
    # Build 3 layer DNN with 15, 20, 15 units respectively.
    classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
            hidden_units=[15,20,15],
            optimizer=tf.train.ProximalAdagradOptimizer(learning_rate=0.05,l2_regularization_strength=0.2),
            n_classes=2,
            model_dir="/home/javier/tf_tinkering")
    # Define the training inputs
    def get_train_inputs():
        x = tf.constant(training_set.data)
        y = tf.constant(training_set.target)
        return x, y
    # Fit model.
    classifier.fit(input_fn=get_train_inputs, steps=50)
    # Define the test inputs
    def get_test_inputs():
        x = tf.constant(test_set.data)
        y = tf.constant(test_set.target)
        return x, y
    # Evaluate accuracy.
    tf.metrics.precision_at_thresholds(
        tf.constant(test_set.target),
        classifier.predict(input_fn=tf.constant(test_set.data)),
        thresholds=[0.5,0.4,0.6],
    )

if __name__ == "__main__":
    main()

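The problem is that the tf.metrics documentation does not explain the "predictions" argument. I have tried different ways of supplying "predictions" and they all return errors. Using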

tf.metrics.precision_at_thresholds(
        tf.constant(test_set.target),
        classifier.predict_classes(input_fn=get_test_inputs),
        thresholds=[0.5,0.4,0.6],
    )

gives me

TypeError: Expected binary or unicode string, got <generator object <genexpr> at 0x7ff38d2c5af0>

Using

tf.metrics.precision_at_thresholds(
        tf.constant(test_set.target),
        classifier.predict_classes(input_fn=tf.constant(test_set.data)),
        thresholds=[0.5,0.4,0.6],
    )

results in

TypeError: 'Tensor' object is not callable

and using

tf.metrics.precision_at_thresholds(
        tf.constant(test_set.target),
        classifier.predict_classes(input_fn=test_set.data),
        thresholds=[0.5,0.4,0.6],
    )

outputs

TypeError: 'numpy.ndarray' object is not callable

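I even tried defining a new matrix "feature_columns_matrix", pasting all the values from the CSV file into it and running "classifier.predict_classes(input_fn=feature_columns_matrix)", and that did not work either. How do I pass the values of the network's output layer, when it is run on the test set, to the tf.metrics routine?

I have read about 10 other similar questions on this site and none of them helped me (just so you know I am not asking a redundant question). Any help would be greatly appreciated! Thanks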

Update: I found that running

print(list(classifier.predict(input_fn=get_test_inputs)))

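correctly returns the predicted class for each sample in the test file. However, this is not what I need for tf.metrics to evaluate accuracy, because the command above returns the classes already computed with the 0.5 threshold! It does not give the actual output of the network's last layer. When I run the command above I get this: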

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

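But what I really need are the actual float32 values the network produces when it is run on the test set, so that I can feed them to tf.metrics and test different thresholds. Does anyone know how to do this?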
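To make the goal concrete, the sketch below shows what testing different thresholds amounts to once per-sample scores are available. It uses plain NumPy, and the labels and scores are made up for illustration; they are not output from the model above. The helper name precision_at is also hypothetical.

```python
import numpy as np

# Made-up values for illustration only; not output from the model above.
labels = np.array([0, 0, 1, 1, 1])
scores = np.array([0.30, 0.55, 0.45, 0.65, 0.90])  # assumed P(class == 1)

def precision_at(threshold):
    """Precision of class 1 when predicting 1 for scores strictly above threshold."""
    predicted = scores > threshold
    true_pos = np.sum(predicted & (labels == 1))
    false_pos = np.sum(predicted & (labels == 0))
    total = true_pos + false_pos
    return float(true_pos) / total if total else 0.0

results = {t: precision_at(t) for t in [0.4, 0.5, 0.6]}
print(results)
```

With hard 0/1 class predictions instead of scores, every threshold between 0 and 1 gives the same result, which is why the raw network outputs are needed.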

Answer 1

TypeError: 'numpy.ndarray' object is not callable

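This means that at this point you have a numpy array and you are trying to use it as if it were a function. That is, you are "calling" it with the arr(...) syntax. Either you should be indexing it with arr[...], or this object should not be an array in the first place.

The same goes for the Tensor object.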
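The error is easy to reproduce outside TensorFlow. A minimal sketch, using a made-up array named arr: indexing it with brackets works, calling it with parentheses fails.

```python
import numpy as np

arr = np.array([10, 20, 30])  # hypothetical data, not from the question

# Indexing uses square brackets and works.
first = arr[0]

# Calling uses parentheses and fails, because an ndarray is not a function.
try:
    arr(0)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(first)
print(error_message)
```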

TypeError: Expected binary or unicode string, got <generator object <genexpr> at 0x7ff38d2c5af0>

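means that the function expected a string as an argument, but you gave it something else (the concept of a generator may be too advanced for you at this point).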
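For context, a generator is an object that produces its values one at a time as it is iterated, rather than holding them all in memory. Code that expects concrete data (a string, a list, an array) will usually reject it until it is materialized, for example with list(). A minimal sketch with made-up values:

```python
import numpy as np

# A generator expression: nothing is computed until it is iterated.
gen = (x * 2 for x in [1, 2, 3])  # hypothetical values

is_list = isinstance(gen, list)  # a generator is not a list

# Materialize the generator before handing it to code that needs real data.
values = list(gen)
as_array = np.array(values)

print(type(gen).__name__, is_list)
print(values)
```

This is the same step as wrapping classifier.predict(...) in list(...) before printing it.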

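Ideally, when programming in Python you need to know what each variable refers to, and in particular what kind of object it is. Is it a function, a string, a number, an array, or a tensorflow object? If you get errors like these, you need to add diagnostic prints that inspect the objects. Don't assume; test.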
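A diagnostic print of this kind might look like the following sketch. The helper name describe and the sample objects are made up for illustration; the point is to print an object's type and whether it can be called before passing it on.

```python
import numpy as np

def describe(name, obj):
    """Return a one-line summary: the object's type and whether it is callable."""
    return "{}: type={}, callable={}".format(name, type(obj).__name__, callable(obj))

data = np.zeros((3, 15), dtype=np.float32)  # stand-in for test_set.data

def get_inputs():
    # Stand-in for an input function such as get_test_inputs.
    return data

gen = (row.sum() for row in data)  # stand-in for a generator result

report = [
    describe("data", data),
    describe("get_inputs", get_inputs),
    describe("gen", gen),
]
print("\n".join(report))
```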

Evidently you are trying different things in this expression:

classifier.predict_classes(input_fn=get_test_inputs)
classifier.predict_classes(input_fn=tf.constant(test_set.data))
classifier.predict_classes(input_fn=test_set.data)

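What does this function accept? I'm guessing test_set.data is a numpy array. Wrapping it in tf.constant() turns it into a Tensor object, whereas get_test_inputs is a function.

Neither an array nor a Tensor can be called, hence the two errors "'numpy.ndarray' object is not callable" and "'Tensor' object is not callable".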
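The parameter name input_fn suggests it should be a function that the classifier calls to obtain the data. The sketch below imitates that with a made-up stand-in, fake_predict, which is not a TensorFlow function; it only shows why passing an array fails while passing a function works.

```python
import numpy as np

def fake_predict(input_fn):
    """Hypothetical stand-in for an API that takes an input function.

    It calls input_fn() to obtain the features, which is why the
    argument has to be callable.
    """
    features = input_fn()
    return features.sum(axis=1)

test_data = np.ones((4, 15), dtype=np.float32)  # made-up data

def get_test_inputs():
    return test_data

# Passing the function itself (no parentheses) works.
ok = fake_predict(get_test_inputs)

# Passing the array directly reproduces the error from the question.
try:
    fake_predict(test_data)
    message = None
except TypeError as exc:
    message = str(exc)

print(ok)
print(message)
```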
