Confusion matrix with TensorFlow

Time: 2018-04-16 21:50:05

Tags: tensorflow scikit-learn tensorboard tensorflow-datasets

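I am using the fine-tuned AlexNet architecture written by @kratzert on my own dataset, and it works fine (I got the code from here: https://github.com/kratzert/finetune_alexnet_with_tensorflow). I would like to figure out how to build a confusion matrix from his code. I tried tf.confusion_matrix(labels, predictions, num_classes), but I could not get it to work. I am confused about what the values of labels and predictions should be. I mean, I know what they should be, but every time I supply those values I get an error. Could anyone help me with this, or take a look at the code (link above) and guide me?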

I added these two lines to finetune.py, right after the accuracy is computed, to get the labels and predictions as class indices:

with tf.name_scope("accuracy"):
    correct_pred = tf.equal(tf.argmax(score, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Added: class indices (not one-hot vectors) for the confusion matrix
true_class = tf.argmax(y, 1)
predicted_class = tf.argmax(score, 1)

I added tf.confusion_matrix() in the session at the very bottom, just before the model checkpoint is saved:

for _ in range(val_batches_per_epoch):

    img_batch, label_batch = sess.run(next_batch)


    acc, cost = sess.run([accuracy, loss], feed_dict={x: img_batch,
                                                    y: label_batch,
                                                    keep_prob: 1.})
    test_acc += acc
    test_count += 1

test_acc /= test_count
print("{} Validation Accuracy = {:.4f} -- Validation Loss = {:.4f}".format(datetime.now(),test_acc, cost))

print("{} Saving checkpoint of model...".format(datetime.now()))

# Added: evaluate and print the confusion matrix
print(sess.run(tf.confusion_matrix(true_class, predicted_class, num_classes)))

# save checkpoint of the model
checkpoint_name = os.path.join(checkpoint_path,
                               'model_epoch'+str(epoch+1)+'.ckpt')
save_path = saver.save(sess, checkpoint_name)

print("{} Model checkpoint saved at {}".format(datetime.now(),
                                               checkpoint_name))

I have also tried putting it in other places, but I get this error every time:

Caused by op 'Placeholder_1', defined at:
  File "/home/armin/Desktop/Alexnet_DataPipeline/finetune.py", line 85, in <module>
    y = tf.placeholder(tf.float32, [batch_size, num_classes])
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/array_ops.py", line 1777, in placeholder
    return gen_array_ops.placeholder(dtype=dtype, shape=shape, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 4521, in placeholder
    "Placeholder", dtype=dtype, shape=shape, name=name)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 3290, in create_op
    op_def=op_def)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/ops.py", line 1654, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'Placeholder_1' with dtype float and shape [128,3]

Any help would be greatly appreciated, thanks.

1 Answer:

Answer 0: (score: 1)

That is a fairly long piece of code you are referring to, and you did not specify where you put the confusion matrix line.

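Just from experience, the most common problem with confusion matrices is that tf.confusion_matrix() expects the labels and predictions as class indices, not as one-hot vectors. In other words, a label or prediction should take the form of the number 5 rather than [0,0,0,0,0,1,0,0,0,0].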
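To make the difference concrete, here is a minimal plain-Python sketch (no TensorFlow required) of the per-row conversion that tf.argmax(..., 1) performs. The argmax helper and the sample values are made up for illustration and are not taken from finetune.py:

```python
def argmax(row):
    """Return the index of the largest value in a sequence (first one on ties)."""
    return max(range(len(row)), key=lambda i: row[i])

# Hypothetical batch of 3 examples with 3 classes.
one_hot_labels = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
scores = [[0.1, 0.2, 0.7], [0.6, 0.3, 0.1], [0.5, 0.4, 0.1]]

# One class index per example, which is the form tf.confusion_matrix() expects.
true_class = [argmax(row) for row in one_hot_labels]
predicted_class = [argmax(row) for row in scores]

print(true_class)       # [2, 0, 1]
print(predicted_class)  # [2, 0, 0]
```

Each one-hot row collapses to a single integer, so a batch of shape [batch_size, num_classes] becomes a vector of length batch_size.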

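In the code you reference, y is in one-hot format. The output of the network, score, is a vector with one score per class. That is not the required format either. You need to do something like:

true_class = tf.argmax(y, 1)
predicted_class = tf.argmax(score, 1)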

and then build the confusion matrix with something like:

tf.confusion_matrix(true_class, predicted_class, num_classes)

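(Basically, if you look at line 123 of finetune.py, both of these expressions already appear there as part of the accuracy computation, but they are not stored in separate tensors.)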

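If you want to keep a running total of the confusion matrix over all batches, you just need to add the per-batch matrices together. Each cell of the matrix counts the number of examples that fall into that category, so element-wise addition yields the confusion matrix for the whole set: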

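# Build the op once, outside the loop, so the graph does not grow with every batch.
cm_op = tf.confusion_matrix(true_class, predicted_class, num_classes)
cm_running_total = None  # initialize once, before iterating over the batches
for _ in range(val_batches_per_epoch):
    img_batch, label_batch = sess.run(next_batch)
    # The placeholders must be fed here, just as for accuracy and loss.
    cm_numpy_array = sess.run(cm_op, feed_dict={x: img_batch, y: label_batch, keep_prob: 1.})
    if cm_running_total is None:
        cm_running_total = cm_numpy_array
    else:
        cm_running_total += cm_numpy_array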
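
The claim that per-batch matrices add up to the matrix of the full set can be checked without TensorFlow. Below is a minimal plain-Python sketch. The batches are made up, and confusion_matrix() and add_matrices() are hypothetical helpers written for this illustration, using the same layout as tf.confusion_matrix (rows are true classes, columns are predicted classes):

```python
def confusion_matrix(true_classes, predicted_classes, num_classes):
    """Count examples per (true class, predicted class) pair."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_classes, predicted_classes):
        cm[t][p] += 1
    return cm

def add_matrices(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

num_classes = 3
# Hypothetical validation batches: (true class indices, predicted class indices).
batches = [
    ([0, 1, 2, 2], [0, 2, 2, 1]),
    ([1, 1, 0, 2], [1, 1, 0, 2]),
]

# Running total, accumulated batch by batch.
cm_running_total = None
for true_batch, pred_batch in batches:
    cm_batch = confusion_matrix(true_batch, pred_batch, num_classes)
    if cm_running_total is None:
        cm_running_total = cm_batch
    else:
        cm_running_total = add_matrices(cm_running_total, cm_batch)

# Confusion matrix computed over the whole set at once.
all_true = [t for true_batch, _ in batches for t in true_batch]
all_pred = [p for _, pred_batch in batches for p in pred_batch]
cm_full = confusion_matrix(all_true, all_pred, num_classes)

print(cm_running_total == cm_full)  # True
```

Because every example contributes exactly one count to exactly one cell, the order and size of the batches do not affect the total.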