Tensorflow confusion matrix in validation on retrain

Asked: 2018-05-29 11:44:14

Tags: python tensorflow machine-learning deep-learning

I have been using the retrain example from the tensorflow hub GitHub repository, and I ran into some problems when trying to add these two things:

  1. A confusion matrix based on the final test results
  2. A way to log the time of each evaluation in the test set and add it to an array

This is the link to the retrain example

    Confusion matrix

    For the confusion matrix, I changed the run_final_eval function to the following:

    def run_final_eval(train_session, module_spec, class_count, image_lists,
                       jpeg_data_tensor, decoded_image_tensor,
                       resized_image_tensor, bottleneck_tensor):
      """Runs a final evaluation on an eval graph using the test data set.

      Args:
        train_session: Session for the train graph with the tensors below.
        module_spec: The hub.ModuleSpec for the image module being used.
        class_count: Number of classes
        image_lists: OrderedDict of training images for each label.
        jpeg_data_tensor: The layer to feed jpeg image data into.
        decoded_image_tensor: The output of decoding and resizing the image.
        resized_image_tensor: The input node of the recognition graph.
        bottleneck_tensor: The bottleneck output layer of the CNN graph.
      """

      test_bottlenecks, test_ground_truth, test_filenames = (
          get_random_cached_bottlenecks(train_session, image_lists,
                                        FLAGS.test_batch_size,
                                        'testing', FLAGS.bottleneck_dir,
                                        FLAGS.image_dir, jpeg_data_tensor,
                                        decoded_image_tensor, resized_image_tensor,
                                        bottleneck_tensor, FLAGS.tfhub_module))
    
      (eval_session, _, bottleneck_input, ground_truth_input, evaluation_step,
       prediction) = build_eval_session(module_spec, class_count)
      test_accuracy, predictions = eval_session.run(
          [evaluation_step, prediction],
          feed_dict={
              bottleneck_input: test_bottlenecks,
              ground_truth_input: test_ground_truth
          })
      tf.logging.info('Final test accuracy = %.1f%% (N=%d)' %
                      (test_accuracy * 100, len(test_bottlenecks)))
    
      confusion = tf.confusion_matrix(labels=test_ground_truth, predictions=predictions,num_classes=class_count)
      print(confusion)
    
      if FLAGS.print_misclassified_test_images:
        tf.logging.info('=== MISCLASSIFIED TEST IMAGES ===')
        for i, test_filename in enumerate(test_filenames):
          if predictions[i] != test_ground_truth[i]:
            tf.logging.info('%70s  %s' % (test_filename,
                                          list(image_lists.keys())[predictions[i]]))
    

    The output is:

    INFO:tensorflow:Final test accuracy = 88.5% (N=710)
    INFO:tensorflow:=== CONwaka ===
    Tensor("confusion_matrix/SparseTensorDenseAdd:0", shape=(5, 5), dtype=int32)
    

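    I also tried tf.logging.info and got the same result. I want to print the matrix out as an array. I also found this answer by MLninja, which seems like a better solution, but I can't figure out how to implement it in the retrain file.

    Any help is much appreciated!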

1 answer:

Answer 0: (score: 0)

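You need to evaluate the confusion matrix tensor. Right now you are adding the confusion matrix op to the graph and printing the op itself, which is why the output shows a Tensor description. What you want is to print the result of running the op, i.e. the matrix. In code, it looks like this: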

confusion_matrix_np = eval_session.run(
  confusion,
  feed_dict={
      bottleneck_input: test_bottlenecks,
      ground_truth_input: test_ground_truth
  })

print(confusion_matrix_np)
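
As an alternative that needs no extra graph op or session call, the counts can also be tallied in plain Python, since `predictions` has already been evaluated by the time the accuracy is logged. The following is only a minimal sketch: it assumes `test_ground_truth` and `predictions` both hold integer class indices (which the `predictions[i] != test_ground_truth[i]` comparison in the question suggests), and the helper name `confusion_matrix` and the example data are hypothetical, not part of retrain.py. Rows are true labels and columns are predicted labels, the same layout `tf.confusion_matrix` uses.

```python
def confusion_matrix(labels, predictions, num_classes):
    """Count (true label, predicted label) pairs into a num_classes x num_classes grid.

    Rows index the true label, columns index the predicted label.
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in zip(labels, predictions):
        matrix[int(true_label)][int(predicted_label)] += 1
    return matrix


# Hypothetical stand-ins for test_ground_truth and predictions.
example_labels = [0, 1, 2, 2, 1, 0]
example_predictions = [0, 2, 2, 2, 1, 1]

example_matrix = confusion_matrix(example_labels, example_predictions,
                                  num_classes=3)
for row in example_matrix:
    print(row)
```

Inside `run_final_eval`, the same helper could be called as `confusion_matrix(test_ground_truth, predictions, class_count)` right after the accuracy is logged, under the assumption above about the label format.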