Adding text labels to a confusion matrix in TensorFlow for TensorBoard

Date: 2018-04-19 13:16:46

Tags: python tensorflow machine-learning tensorboard confusion-matrix

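I am customizing the code from the TensorFlow example retrain.py to train on my own images, adding extra dense layers, dropout, momentum gradient descent, and so on.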

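I want to add a confusion matrix to TensorBoard, so I followed the first answer (Jerod's) to this post (I also tried the second answer but ran into some debugging problems) and added the confusion matrix to the add_evaluation_step function.

This gives me: [image: confusion matrix]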

My question is how to add labels to the rows (actual class) and columns (predicted class), to get something like this:

[image: required confusion matrix]

2 answers:

Answer 0 (score: 3)

Jerod's answer contains almost everything you need, together with another answer by yauheni_selivonchyk that shows how to add custom images to TensorBoard.

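It is just a matter of putting everything together, namely:

  1. Implement methods to pass a plotted image to a summary (as an RGB array)
  2. Implement a method to convert the matrix data into a prettified confusion image
  3. Define the running evaluation ops that collect the confusion matrix data (along with other metrics), and prepare a placeholder and a summary to receive the plotted image
  4. Use everything together

1. Implement methods to pass a plotted image to a summary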

    import matplotlib
    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns
    import numpy as np
    import tensorflow as tf
    
    # Inspired by yauheni_selivonchyk on SO (https://stackoverflow.com/a/42815564/624547)
    
    def get_figure(figsize=(10, 10), dpi=300):
        """
        Return a pyplot figure
        :param figsize:
        :param dpi:
        :return:
        """
        fig = plt.figure(num=0, figsize=figsize, dpi=dpi)
        fig.clf()
        return fig
    
    
    def fig_to_rgb_array(fig, expand=True):
        """
        Convert figure into a RGB array
        :param fig:         PyPlot Figure
        :param expand:      Flag to expand
        :return:            RGB array
        """
        fig.canvas.draw()
        buf = fig.canvas.tostring_rgb()
        ncols, nrows = fig.canvas.get_width_height()
        shape = (nrows, ncols, 3) if not expand else (1, nrows, ncols, 3)
        # np.fromstring is deprecated for binary data; np.frombuffer reads the same bytes
        return np.frombuffer(buf, dtype=np.uint8).reshape(shape)
    
    
    def figure_to_summary(fig, summary, place_holder):
        """
        Convert figure into TF summary
        :param fig:             Figure
        :param summary:         Summary to eval
        :param place_holder:    Summary image placeholder
        :return:                Summary
        """
        image = fig_to_rgb_array(fig)
        return summary.eval(feed_dict={place_holder: image})
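
Newer Matplotlib releases have deprecated `tostring_rgb`, so the conversion step may need adjusting depending on the installed version. Below is a minimal sketch of an alternative, assuming the Agg backend and its `buffer_rgba()` method. The function name `fig_to_rgb_array_agg` and the tiny demo figure are illustrative only and are not part of the answer.

```python
import numpy as np
from matplotlib.backends.backend_agg import FigureCanvasAgg
from matplotlib.figure import Figure


def fig_to_rgb_array_agg(fig, expand=True):
    """Render `fig` off-screen and return it as a uint8 RGB array."""
    canvas = FigureCanvasAgg(fig)            # attach an Agg canvas; no display needed
    canvas.draw()
    rgba = np.asarray(canvas.buffer_rgba())  # shape (rows, cols, 4)
    rgb = rgba[..., :3].copy()               # drop the alpha channel
    # Optionally add a leading batch dimension, as tf.summary.image expects 4-D input
    return rgb[np.newaxis] if expand else rgb


# Hypothetical demo figure: 2 x 2 inches at 50 dpi
fig = Figure(figsize=(2, 2), dpi=50)
ax = fig.add_subplot(1, 1, 1)
ax.imshow(np.eye(3), cmap="Oranges")
image = fig_to_rgb_array_agg(fig)
print(image.shape, image.dtype)
```

The returned array has the same layout as the one produced by `fig_to_rgb_array`, so it could be fed to the same placeholder.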
    

2. Convert the matrix data into a prettified confusion image

(This is only an example; adapt it to whatever you want.)

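    def confusion_matrix_to_image_summary(confusion_matrix, summary, place_holder, 
                                          list_classes, figsize=(9, 9)):
        """
        Plot confusion matrix and return as TF summary
        :param confusion_matrix:    Confusion matrix (N x N)
        :param summary:             Image summary op to evaluate
        :param place_holder:        Placeholder fed with the plotted image
        :param list_classes:        List of class names (N)
        :param figsize:             Pyplot figsize for the confusion image
        :return:                    Serialized image summary
        """
        # Use the requested figsize (it must match the placeholder shape):
        fig = get_figure(figsize=figsize)
        df = pd.DataFrame(confusion_matrix, index=list_classes, columns=list_classes)
        # The matrix holds raw counts here, so annotate with integers rather than percentages:
        ax = sns.heatmap(df, annot=True, fmt='.0f')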
        # Whatever embellishments you want:
        plt.title('Confusion matrix')
        plt.xticks(rotation=90)
        plt.yticks(rotation=0)
        image_sum = figure_to_summary(fig, summary, place_holder)
        return image_sum
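
A percentage format such as `'.0%'` in the heatmap annotations only reads correctly when each row has been divided by its total, because the aggregated matrix holds raw counts. A minimal sketch of that normalization, assuming rows index the true class (as in `tf.confusion_matrix`); the helper name `normalize_rows` and the sample counts are made up for illustration:

```python
import numpy as np


def normalize_rows(confusion):
    """Divide each row by its total; all-zero rows stay zero."""
    confusion = np.asarray(confusion, dtype=np.float64)
    row_totals = confusion.sum(axis=1, keepdims=True)
    # Avoid 0/0 for classes that never occur in the evaluation set.
    safe_totals = np.where(row_totals == 0, 1.0, row_totals)
    return confusion / safe_totals


# Hypothetical counts for three classes; the third class has no samples.
counts = np.array([[8, 2, 0],
                   [1, 9, 0],
                   [0, 0, 0]])
rates = normalize_rows(counts)
print(rates)
```

The resulting `rates` array could be passed to the plotting function in place of the raw counts when per-class rates are more useful than absolute numbers.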
    

3. Define your evaluation ops and prepare the placeholder

    # Inspired by Jerod's answer on SO (https://stackoverflow.com/a/42857070/624547)    
    def add_evaluation_step(result_tensor, ground_truth_tensor, num_classes, confusion_matrix_figsize=(9, 9)):
        """
        Sets up the evaluation operations, computing the running accuracy and confusion image
        :param result_tensor:               Output tensor
        :param ground_truth_tensor:         Target class tensor
        :param num_classes:                 Number of classes
        :param confusion_matrix_figsize:    Pyplot figsize for the confusion image
        :return:                            TF operations, summaries and placeholders (see usage below)
        """
        scope = "evaluation"
        with tf.name_scope(scope):
            predictions = tf.argmax(result_tensor, 1, name="prediction")
    
            # Streaming accuracy (lookup and update tensors):
            accuracy, accuracy_update = tf.metrics.accuracy(ground_truth_tensor, predictions, name='accuracy')
            # Per-batch confusion matrix:
            batch_confusion = tf.confusion_matrix(ground_truth_tensor, predictions, num_classes=num_classes,
                                                  name='batch_confusion')
    
            # Aggregated confusion matrix:
            confusion_matrix = tf.Variable(tf.zeros([num_classes, num_classes], dtype=tf.int32),
                                           name='confusion')
            confusion_update = confusion_matrix.assign(confusion_matrix + batch_confusion)
    
            # Group the update ops so that a single run call updates both streaming metrics:
            evaluate_streaming_metrics_op = tf.group(accuracy_update, confusion_update)
    
            # Confusion image from matrix (need to extend dims + cast to float so tf.summary.image renormalizes to [0,255]):
            # (read the aggregated variable, not the update op, so fetching it needs no feed and adds nothing)
            confusion_image = tf.reshape(tf.cast(confusion_matrix, tf.float32), [1, num_classes, num_classes, 1])
    
            # Summaries:
            tf.summary.scalar('accuracy', accuracy, collections=[scope])
            summary_op = tf.summary.merge_all(scope)
    
            # Preparing placeholder for confusion image (so that we can pass the plotted image to it):
            #      (we basically pre-allocate a plot figure and pass its RGB array to a placeholder)
            confusion_image_placeholder = tf.placeholder(tf.uint8,
                                                         fig_to_rgb_array(get_figure(figsize=confusion_matrix_figsize)).shape)
            confusion_image_summary = tf.summary.image('confusion_image', confusion_image_placeholder)
    
        # Isolating all the variables stored by the metric operations:
        running_vars = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope=scope)
        running_vars += tf.get_collection(tf.GraphKeys.LOCAL_VARIABLES, scope=scope)
    
        # Initializer op to start/reset running variables
        reset_streaming_metrics_op = tf.variables_initializer(var_list=running_vars)
    
        return evaluate_streaming_metrics_op, reset_streaming_metrics_op, summary_op, confusion_image_summary, \
               confusion_image_placeholder, confusion_image
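
The reset / update / read cycle that these ops implement can be easier to follow outside the graph. The sketch below is a plain NumPy stand-in, not the TensorFlow code: the class `RunningConfusion` and the sample batches are hypothetical and only mirror the idea of accumulating per-batch confusion counts into one matrix.

```python
import numpy as np


class RunningConfusion:
    """NumPy stand-in for the aggregated confusion variable (illustrative only)."""

    def __init__(self, num_classes):
        self.matrix = np.zeros((num_classes, num_classes), dtype=np.int64)

    def reset(self):
        # Plays the role of the reset op: zero the running counts.
        self.matrix[:] = 0

    def update(self, labels, predictions):
        # Rows index the true class, columns the predicted class.
        # np.add.at counts repeated (label, prediction) pairs correctly.
        np.add.at(self.matrix, (labels, predictions), 1)


running = RunningConfusion(num_classes=3)
running.update(np.array([0, 1, 2]), np.array([0, 2, 2]))  # first batch
running.update(np.array([1, 1]), np.array([1, 1]))        # second batch
total = running.matrix.copy()
running.reset()
print(total)
```

In the graph version, the reset must run before each evaluation pass, otherwise counts from the previous pass leak into the next one.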
    

4. Put everything together

A quick example of how to use it, although it needs to be adapted to your training procedure, etc.

    classes = ["obj1", "obj2", "obj3"]
    num_classes = len(classes)
    model = your_network(...)
    
    # Parentheses keep the multi-line tuple unpacking valid Python:
    (evaluate_streaming_metrics_op, reset_streaming_metrics_op, summary_op,
     confusion_image_summary, confusion_image_placeholder, confusion_image) = \
        add_evaluation_step(model.output, model.target, num_classes)
    
    def evaluate(session, model, eval_data_gen):
        """
        Evaluate the model
        :param session:         TF session
        :param eval_data_gen:   Data to evaluate on
        :return:                Evaluation summaries for Tensorboard
        """
        # Resetting streaming vars:
        session.run(reset_streaming_metrics_op)
    
        # Evaluating running ops over complete eval dataset, e.g.:
        for batch in eval_data_gen:
            feed_dict = {model.inputs: batch}
            session.run(evaluate_streaming_metrics_op, feed_dict=feed_dict)
    
        # Obtaining the final results:
        summary_str, confusion_results = session.run([summary_op, confusion_image])
    
        # Converting confusion data into plot into summary:
        confusion_img_str = confusion_matrix_to_image_summary(
            confusion_results[0,:,:,0], confusion_image_summary, confusion_image_placeholder, classes)
        summary_str += confusion_img_str
    
        return summary_str # to be given to a SummaryWriter
    

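Answer 1 (score: 0)

Following MLNINJA's answer helped me get not only the labels but also a nice live streaming visualization. Here is how I did it. First I added this function to retrain.py: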

    from textwrap import wrap
    import itertools
    import matplotlib.figure  # import the submodule explicitly so matplotlib.figure.Figure resolves
    import numpy as np
    import tensorflow as tf
    import tfplot
    import os
    import re

    def plot_confusion_matrix(correct_labels, predict_labels,labels,session, title='Confusion matrix', tensor_name = 'MyFigure/image', normalize=False):
      conf = tf.contrib.metrics.confusion_matrix(correct_labels, predict_labels)

      cm=session.run(conf)

      if normalize:
        # Scale each row to the 0-10 range and keep integers for the cell labels
        cm = cm.astype('float')*10 / cm.sum(axis=1)[:, np.newaxis]
        cm = np.nan_to_num(cm, copy=True)
        cm = cm.astype('int')

      np.set_printoptions(precision=2)

      fig = matplotlib.figure.Figure(figsize=(7, 7), dpi=320, facecolor='w', edgecolor='k')
      ax = fig.add_subplot(1, 1, 1)
      im = ax.imshow(cm, cmap='Oranges')

      # Split CamelCase class names into words, then wrap long names over several lines
      classes = [re.sub(r'([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))', r'\1 ', x) for x in labels]
      classes = ['\n'.join(wrap(l, 40)) for l in classes]

      tick_marks = np.arange(len(classes))

      ax.set_xlabel('Predicted', fontsize=7)
      ax.set_xticks(tick_marks)
      c = ax.set_xticklabels(classes, fontsize=10, rotation=-90,  ha='center')
      ax.xaxis.set_label_position('bottom')
      ax.xaxis.tick_bottom()

      ax.set_ylabel('True Label', fontsize=7)
      ax.set_yticks(tick_marks)
      ax.set_yticklabels(classes, fontsize=10, va ='center')
      ax.yaxis.set_label_position('left')
      ax.yaxis.tick_left()

      # Write the count into each cell ('.' for empty cells)
      for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        ax.text(j, i, format(cm[i, j], 'd') if cm[i,j]!=0 else '.', horizontalalignment="center", fontsize=6, verticalalignment='center', color= "black")

      fig.set_tight_layout(True)
      summary = tfplot.figure.to_summary(fig, tag=tensor_name)
      return summary
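
The label clean-up in this function (splitting CamelCase names with a regex, then wrapping long names) can be tried on its own with only the standard library. The sketch below reuses the same regular expression; the helper name `format_label`, the sample class names, and the narrow width of 10 are assumptions chosen to make the wrapping visible.

```python
import re
from textwrap import wrap

# Same pattern as in plot_confusion_matrix: a lowercase letter followed by an
# uppercase one, or the last capital of an acronym followed by a new word.
CAMEL_BOUNDARY = re.compile(r'([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))')


def format_label(label, width=40):
    """Insert spaces at CamelCase boundaries and wrap to `width` characters."""
    spaced = CAMEL_BOUNDARY.sub(r'\1 ', label)
    return '\n'.join(wrap(spaced, width))


# Hypothetical class names
formatted = [format_label(name, width=10)
             for name in ["GoldenRetriever", "HTTPServer", "cat"]]
print(formatted)
```

Wrapped labels keep the tick text compact, which matters once the matrix has more than a handful of classes.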

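In the main function of my version of retrain.py, a summary writer conf__writer is first created at line 1227 for the confusion matrix. The function is then called (at line 1287) inside the if clause (line 1261) that runs at every evaluation step, and finally the summary is written to the summary directory at line 1288.

Note: the add_evaluation_step function was also modified to return the tensor of ground-truth inputs. At line 1278 this is used to obtain the array of ground-truth inputs, which is fed to the plot_confusion_matrix function.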