Problem broadcasting label data from a generator

Time: 2019-12-19 00:00:14

Tags: python tensorflow machine-learning image-processing keras

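I am working on the CK+ dataset for facial expression recognition. I pass the face images and labels through datagen.flow_from_directory to extract facial features and map them to the labels.

The labels are categorical and range from 0 to 7, and the generator appears to return them one-hot encoded. My problem is that I cannot store the one-hot encoded label batches in my labels array.

I get the following error: ValueError: could not broadcast input array from shape (32,8) into shape (32)

The code is as follows: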

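import scipy
import os, shutil
import numpy as np  # needed for np.zeros below
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Note: conv_base (the pretrained convolutional base) is assumed to be defined elsewhere.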

img_width, img_height = 224, 224

datagen = ImageDataGenerator(rescale=1./255)
batch_size = 32

def extract_features(directory, sample_count):
    features = np.zeros(shape=(sample_count, 7, 7, 512))  # Must be equal to the output of the convolutional base
    labels = np.zeros(shape=(sample_count))
    print(sample_count, 7, 7, 512)
    # Preprocess data - flow_from_directory allows us to extract 
    #... features and labels directly from a directory
    generator = datagen.flow_from_directory(directory,
                                            target_size=(img_width,img_height),
                                            batch_size = batch_size,
                                            class_mode='categorical')

    i = 0
    for inputs_batch, labels_batch in generator:
        features_batch = conv_base.predict(inputs_batch)
        features[i * batch_size: (i + 1) * batch_size] = features_batch
        labels[i * batch_size: (i + 1) * batch_size] = labels_batch
        i += 1
        if i * batch_size >= sample_count:
            break
    return features, labels

I get the following shapes:

Found 209 images belonging to 8 classes.
Input batch shape:  (32, 224, 224, 3)
Features batch shape:  (32, 7, 7, 512)
Features shape:  (209, 7, 7, 512)
Labels batch shape:  (32, 8)

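So I am confused about why features_batch can be broadcast but labels_batch cannot.

I have tried several things, including:

1) Flattening the labels array. This makes no sense and was only a check; I got the full element count of rows times columns, 32 * 8 = 256 (as expected).

2) Using just labels[i] = labels_batch or labels = labels_batch, which returns only the last few labels (17, the remainder from 209 - (6 * 32) = 17).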

3) Adapting another solution from this question, by doing the following:

for c in range(0,7):
    labels[i * batch_size: (i + 1) * batch_size, [c]] = labels_batch

But that gives the following error:

ValueError: Error when checking input: expected input_3 to have 4 dimensions, but got array with shape (32, 8)

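I feel like what I am missing is something simple, but I cannot seem to figure it out. Does anyone have any ideas?

Thanks!

1 answer:

Answer 0: (score: 1)

Your labels array should be allocated as labels = np.zeros(shape=(sample_count, num_classes)) rather than labels = np.zeros(shape=(sample_count)), where num_classes is 8 for this dataset. The labels from the generator should then be assigned with labels[i * batch_size: (i + 1) * batch_size, :] = labels_batch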
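This also explains why features_batch is accepted and labels_batch is not: features is allocated with the full 4-D shape, so each slice already has shape (32, 7, 7, 512), while a slice of the 1-D labels array has shape (32,) and cannot hold a (32, 8) batch. Below is a minimal NumPy-only sketch of the mismatch and the fix. The fake_label_batches helper is a hypothetical stand-in for the Keras generator (it is finite, unlike the real one, so no break is needed), and the random labels are made up purely for illustration.

```python
import numpy as np

sample_count, batch_size, num_classes = 209, 32, 8

def fake_label_batches(n, batch_size, num_classes, seed=0):
    """Hypothetical stand-in for the generator: yields one-hot label batches."""
    rng = np.random.default_rng(seed)
    class_ids = rng.integers(0, num_classes, size=n)
    one_hot = np.eye(num_classes)[class_ids]  # shape (n, num_classes)
    for start in range(0, n, batch_size):
        yield one_hot[start:start + batch_size]

# Original allocation: 1-D, so each slice has shape (32,) and a
# (32, 8) batch cannot be assigned into it.
labels_1d = np.zeros(shape=(sample_count,))
first_batch = next(fake_label_batches(sample_count, batch_size, num_classes))
try:
    labels_1d[0:batch_size] = first_batch
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print("1-D allocation fails with:", error_message)

# Corrected allocation: one row per sample, one column per class.
labels = np.zeros(shape=(sample_count, num_classes))
for i, labels_batch in enumerate(
        fake_label_batches(sample_count, batch_size, num_classes)):
    start = i * batch_size
    # The last batch holds only 17 rows, so slice by the batch's own length.
    labels[start:start + len(labels_batch), :] = labels_batch

# Optional: recover integer class ids (0..7) from the one-hot rows.
class_ids = labels.argmax(axis=1)
print(labels.shape, class_ids.shape)
```

If integer labels are what you want in the first place, passing class_mode='sparse' to flow_from_directory should yield 1-D label batches that fit the original 1-D labels array, though I have not run that against your data.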