Plotting images and labels as a bar chart using matplotlib and numpy

Time: 2017-03-09 02:28:48

Tags: python numpy matplotlib plot

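I have to plot the distribution of the images in my dataset as a bar chart. I have looked at several approaches, but none of them worked.

I have two numpy arrays:

X_train - shape (20000, 32, 32, 3)
y_train - shape (20000)

labels - a dictionary mapping label index to label string. Size 50.

So X_train contains the images and y_train contains the corresponding label indices.

I need to plot a bar chart of X_train over the 50 labels, showing how many images there are for each label.

Should I first group the images in X_train by their corresponding index in y_train? How does that fit into the matplotlib.bar API call?

Or should I use the numpy histogram API?

Any help is much appreciated.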

2 Answers:

Answer 0: (score: 0)

One way is to use a histogram with a few extra parameters. You could use something like this:

import numpy as np
import matplotlib.pyplot as plt

# Small example label array
y = np.array([0, 0, 1, 2, 1])

plt.hist(y, align='mid', range=(np.min(y), np.max(y)+1), bins=50)
plt.xlabel("labels")
plt.ylabel("image counts")
plt.show()


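Here, the plot shows that labels 0 and 1 each occur twice and label 2 occurs once. Take the labels from y_train and plot them by their counts in the same way. Feel free to change the number of bins to match your labels.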
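Since the bin count has to match the labels, another option is to skip plt.hist and count the labels directly. The following is only a minimal sketch, not the code from this answer: count_labels and plot_label_counts are hypothetical helper names, labels is assumed to be the index-to-name dictionary from the question, and the random y_train is a stand-in for the real array.

```python
import numpy as np


def count_labels(y, n_labels):
    """Return how many times each label index 0..n_labels-1 occurs in y."""
    # ravel() also handles label arrays of shape (N, 1)
    return np.bincount(np.ravel(y), minlength=n_labels)


def plot_label_counts(counts, names):
    """Draw one bar per label; names[i] is the tick text for label i."""
    # Imported here so count_labels stays usable without matplotlib.
    import matplotlib.pyplot as plt

    x = np.arange(len(counts))
    fig, ax = plt.subplots(figsize=(12, 4))
    ax.bar(x, counts)
    ax.set_xticks(x)
    ax.set_xticklabels(names, rotation=90)
    ax.set_xlabel("labels")
    ax.set_ylabel("image counts")
    fig.tight_layout()
    return fig, ax


# Stand-in data (assumption): 20000 label indices drawn from 50 classes.
rng = np.random.default_rng(0)
y_train = rng.integers(0, 50, size=20000)
labels = {i: "class_{}".format(i) for i in range(50)}  # placeholder names

counts = count_labels(y_train, n_labels=len(labels))
names = [labels[i] for i in range(len(labels))]
# To display: fig, ax = plot_label_counts(counts, names),
# then call matplotlib.pyplot.show().
```

Counting first means X_train never has to be regrouped: only y_train is needed, and labels with zero images still get a (zero-height) bar because of minlength.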

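Answer 1: (score: 0)

Well, this question was posted 3 years ago and you may no longer need an answer, but this might help others who are looking for one. This uses the CIFAR10 dataset, with the training data split into training and validation sets: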

from tensorflow.keras.datasets import cifar10
import matplotlib.pyplot as plt
import numpy as np
from sklearn.model_selection import train_test_split

(x_train, y_train), (x_test, y_test) = cifar10.load_data()

x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=0.1, random_state=1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_val = x_val.astype('float32')


classes = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']

list_classes_train = []
list_classes_val = []

# Count the samples of each class without overwriting x_train / x_val,
# so the dataset does not have to be reloaded on every iteration.
for i in range(len(np.unique(y_train))):

    n_train = int(np.sum(y_train == i))
    n_val = int(np.sum(y_val == i))

    #print("The training samples of class {} -> {} is {}" .format(i, classes[i], n_train))
    #print("The validation samples of class {} -> {} is {}" .format(i, classes[i], n_val))

    list_classes_train.append(n_train)
    list_classes_val.append(n_val)
x = np.arange(len(classes))
width = 0.35

fig, ax = plt.subplots()

rects1 = ax.bar(x - width/2, list_classes_train, width, label='train')
rects2 = ax.bar(x + width/2, list_classes_val, width, label='val')

ax.set_ylabel('data points')
ax.set_title('Training and Validation set')
ax.set_xticks(x)
ax.set_xticklabels(classes, rotation = 90)
ax.legend()


def autolabel(rects):
    """Attach a text label above each bar in *rects*, displaying its height."""
    for rect in rects:
        height = rect.get_height()
        ax.annotate('{}'.format(height),
                    xy=(rect.get_x() + rect.get_width() / 2, height),
                    xytext=(0, 3),  # 3 points vertical offset
                    textcoords="offset points",
                    ha='center', va='bottom')


autolabel(rects1)
autolabel(rects2)

fig.tight_layout()

plt.show()

The result is:

[Image: bar plot of the training and validation sets]
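For what it's worth, the grouped chart can probably be written more compactly. The sketch below is an assumption-laden variant, not the code that produced the figure: class_counts and plot_split_counts are hypothetical helper names, the random labels only mimic the (N, 1) shape of the CIFAR10 label arrays, and Axes.bar_label requires Matplotlib 3.4 or later (it takes the place of the hand-written autolabel function).

```python
import numpy as np

classes = ['airplane', 'automobile', 'bird', 'cat', 'deer',
           'dog', 'frog', 'horse', 'ship', 'truck']


def class_counts(y, n_classes):
    """Count samples per class; y may have shape (N,) or (N, 1)."""
    return np.bincount(np.ravel(y), minlength=n_classes)


def plot_split_counts(train_counts, val_counts, names, width=0.35):
    """Grouped bar chart of per-class counts for two splits."""
    # Imported here so class_counts stays usable without matplotlib.
    import matplotlib.pyplot as plt  # bar_label needs Matplotlib >= 3.4

    x = np.arange(len(names))
    fig, ax = plt.subplots()
    rects1 = ax.bar(x - width / 2, train_counts, width, label='train')
    rects2 = ax.bar(x + width / 2, val_counts, width, label='val')
    # bar_label writes each bar's height above it, as autolabel does
    ax.bar_label(rects1, padding=3)
    ax.bar_label(rects2, padding=3)
    ax.set_ylabel('data points')
    ax.set_title('Training and Validation set')
    ax.set_xticks(x)
    ax.set_xticklabels(names, rotation=90)
    ax.legend()
    fig.tight_layout()
    return fig, ax


# Synthetic stand-ins for y_train / y_val (assumption), shaped (N, 1)
rng = np.random.default_rng(1)
y_train = rng.integers(0, 10, size=(4500, 1))
y_val = rng.integers(0, 10, size=(500, 1))

train_counts = class_counts(y_train, len(classes))
val_counts = class_counts(y_val, len(classes))
# To display: plot_split_counts(train_counts, val_counts, classes),
# then call matplotlib.pyplot.show().
```

With the real data, the two count arrays would come from the y_train and y_val returned by train_test_split, and the image arrays would not need to be touched at all.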

To plot only the training set:

y_pos = range(len(classes))

# Draw on the current pyplot figure and keep the bar container for labeling
bars = plt.bar(y_pos, list_classes_train)

# Uncomment to show the rotated class names on the x-axis
# plt.xticks(y_pos, classes, rotation=90)

for rect in bars:
    height = rect.get_height()
    plt.text(rect.get_x() + rect.get_width() / 2.0, height, '%d' % int(height), ha = 'center', va = 'bottom')

plt.show()

The result is:

[Image: bar plot of the training set]