Keras ImageDataGenerator: Why are my CNN's outputs reversed?

Time: 2019-01-26 10:10:13

Tags: python pandas machine-learning keras deep-learning

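I'm trying to write a CNN that distinguishes cats from dogs. I set the labels to dog: 0 and cat: 1, so I expect the CNN to output 0 for a dog and 1 for a cat. However, it does the opposite (it gives 0 for a cat and 1 for a dog). Please look at my code and see where I went wrong. Thanks.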

I'm currently using Python 3.6.8 in a Jupyter notebook (all of the code below is copied and pasted from different cells of that notebook).

import os
import cv2
from random import shuffle
import numpy as np
from keras.preprocessing.image import ImageDataGenerator, load_img, img_to_array
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense, Activation, Conv2D, MaxPooling2D, Flatten, Dropout, BatchNormalization
from keras.callbacks import EarlyStopping, ReduceLROnPlateau
%matplotlib inline

train_dir = r'C:\Users\tohho\Desktop\Python pypipapp\Machine Learning\data\PetImages\train'
test_dir = r'C:\Users\tohho\Desktop\Python pypipapp\Machine Learning\data\PetImages\test1'
IMG_WIDTH = 100
IMG_HEIGHT = 100
batch_size = 32



######## THIS IS WHERE I LABELLED 0 FOR DOG AND 1 FOR CAT ##########
filenames = os.listdir(train_dir)
categories = [] 
for filename in filenames:
    category = filename.split('.')[0]
    if category == 'cat':
        categories.append(1)
    elif category == 'dog':
        categories.append(0)

df = pd.DataFrame({'filename':filenames, 'class':categories}) # making the dataframe

#### I SPLIT THE DATA INTO TRAIN AND VALIDATION DATASETS ####
df_train, df_validate = train_test_split(df, test_size=0.25) # splitting data for train/test
# need to reset the index of both dataframes so ImageDataGenerator works properly
df_train = df_train.reset_index(drop=True)
df_validate = df_validate.reset_index(drop=True)

print(df_train['class'].value_counts())
print(df_validate['class'].value_counts())

len_training = df_train.shape[0]
len_validate = df_validate.shape[0]
print('{} training eg, {} test eg'.format(len_training, len_validate))



#### CREATE IMAGE DATA GENERATORS ####
train_datagen = ImageDataGenerator(rescale=1./255,
                               shear_range = 0.2,
                               zoom_range = 0.2,
                               horizontal_flip = True)
# our train_datagen generator will use the following transformations on the images
validation_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_dataframe(df_train, 
                                                    train_dir,
                                                    target_size=(IMG_WIDTH, IMG_HEIGHT),
                                                    batch_size=batch_size,
                                                    x_col='filename',
                                                    y_col='class',
                                                    class_mode = 'binary')

# generator = ImageDataGenerator(*args).flow_from_dataframe(dataframe, directory, target_size,
# batch_size, x_col, y_col, class_mode)
# your dataframe should be in the format such that x_col = features, y_col = class/label
# binary class mode since output is either 0(dog) or 1(cat)

validation_generator = validation_datagen.flow_from_dataframe(df_validate, 
                                                   train_dir,
                                                    target_size=(IMG_WIDTH, IMG_HEIGHT),
                                                    x_col='filename',
                                                    y_col='class',
                                                    class_mode='binary', 
                                                  batch_size=batch_size)

########## BUILDING MODEL ############
model = Sequential()
model.add(Conv2D(32, (3,3), input_shape=(IMG_WIDTH, IMG_HEIGHT, 3)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Conv2D(64, (3,3)))  # input_shape is only needed on the first layer
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Conv2D(128, (3,3)))  # input_shape is only needed on the first layer
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2,2)))

model.add(Flatten()) # remember to flatten conv2d to dense layer
model.add(Dense(256))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.4))

model.add(Dense(1))
model.add(Activation('sigmoid')) 
# since we have only 1 output with range [0,1], we use sigmoid
# if there were n categories, use softmax

# binary_crossentropy since output is either 0,1
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

earlystop = EarlyStopping(monitor='val_loss', patience=3) # stops training if val_loss doesn't improve
learning_rate_reduction = ReduceLROnPlateau(monitor='val_acc', 
                                            patience=2, 
                                            verbose=1, 
                                            factor=0.5, 
                                            min_lr=0.000001) 
# reduces the learning rate if val_acc doesn't improve
callbacks = [earlystop, learning_rate_reduction]

##### FIT THE MODEL #####
epochs = 50
model.fit_generator(train_generator,
                   steps_per_epoch=len_training//batch_size,
                   verbose=1,
                   epochs=epochs,
                   validation_data=validation_generator,
                   validation_steps=len_validate//batch_size,
                   callbacks=callbacks) # fitting model


######### PREDICTING #############
output_generator = validation_datagen.flow_from_dataframe(df_output,
                                                   outputdir,
                                                   x_col='filename',
                                                   y_col=None,
                                                   class_mode=None,
                                                   target_size=(IMG_WIDTH, IMG_HEIGHT),
                                                   shuffle=False,
                                                   batch_size=batch_size)
predictions = model.predict_generator(output_generator, 
                                      steps=np.ceil(len_output/batch_size))
df_output['probability'] = predictions
df_output['label'] = np.where(df_output['probability'] > 0.5, 'cat','dog')
df_output.head()

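The CNN outputs the opposite of the correct answer, and when I invert the output I get the expected results (correct identification and accuracy). I know that simply changing the line df_output['label'] = np.where(df_output['probability'] > 0.5, 'cat','dog') to df_output['label'] = np.where(df_output['probability'] < 0.5, 'cat','dog') would fix the problem, but that doesn't help me figure out why the CNN's output is reversed.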

1 answer:

Answer 0 (score: 4)

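The cause of the problem is subtle. I'll illustrate what is happening with a toy example. Suppose we instantiate a data generator with the following code: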

# List of image paths, doesn't matter here
image_paths = ['./img_{}.png'.format(i) for i in range(5)] 
labels = ...  # List of labels

df = pd.DataFrame()
df['filename'] = image_paths
df['class'] = labels

generator = ImageDataGenerator().flow_from_dataframe(dataframe=df, 
                                                    directory='./',
                                                    x_col='filename',
                                                    y_col='class')

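ImageDataGenerator expects the class column of the dataframe to contain the string labels associated with the images. Internally, it maps these labels to class integers. You can check this mapping by inspecting the generator's class_indices attribute. After instantiating the generator with the following list of labels: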

labels = ['cat', 'cat', 'cat', 'dog', 'dog']

the class_indices mapping looks like this:

generator.class_indices
> {'cat': 0, 'dog': 1}

Let's instantiate the generator again, but change the label of the first image:

labels = ['dog', 'cat', 'cat', 'dog', 'dog']
# After re-instantiating the generator
generator.class_indices
> {'dog': 0, 'cat': 1}

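The integer encoding of our classes is swapped, which shows that the internal mapping from labels to class integers depends on the order in which the distinct classes are encountered.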
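The order dependence described here can be mimicked in a few lines of plain Python. This is only a toy model of the behaviour shown in the outputs above, not Keras's actual implementation, and the helper name first_seen_class_indices is made up for illustration.

```python
def first_seen_class_indices(labels):
    """Toy stand-in: number the distinct labels in order of first appearance."""
    mapping = {}
    for label in labels:
        if label not in mapping:
            # The next unused integer goes to each newly seen label.
            mapping[label] = len(mapping)
    return mapping


# Same label sets as in the examples above, differing only in the first label.
print(first_seen_class_indices(['cat', 'cat', 'cat', 'dog', 'dog']))
print(first_seen_class_indices(['dog', 'cat', 'cat', 'dog', 'dog']))
```

Under this toy rule, swapping the label of the first image is enough to swap the two class integers, which matches the two class_indices results shown.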

You are mapping cat to 1 and dog to 0, but ImageDataGenerator interprets these values as label strings and maps them internally to integers of its own.

Now, what happens if the first image in the directory is a cat?

labels = [1, 0, 1, 0, 0] # ['cat', 'dog', 'cat', 'dog', 'dog']
# After re-instantiating the generator
generator.class_indices
> {1: 0, 0: 1}  # !

This is the source of your confusion. :) To avoid it, you can either:

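  • Use "cat" and "dog" in the label column of the dataframe and let ImageDataGenerator handle the mapping for you, or
  • Specify the mapping explicitly by passing a list of classes to the classes argument in the call to flow_from_dataframe.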
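Whichever option you choose, it also helps to decode predictions through the generator's own mapping rather than hard-coding which side of 0.5 means 'cat'. Below is a minimal sketch in plain Python. It assumes that class_indices is the dict read from the training generator, that file names follow the 'cat.0.jpg' pattern from the question, and that the sigmoid output is the probability of the class with index 1. The helper names and sample values are hypothetical.

```python
def label_from_filename(filename):
    """Return the string label encoded in names such as 'cat.0.jpg'."""
    return filename.split('.')[0]


def decode_binary_predictions(probabilities, class_indices, threshold=0.5):
    """Turn sigmoid outputs into label names via the generator's mapping."""
    # Invert {'label': index} into {index: 'label'}.
    index_to_label = {index: label for label, index in class_indices.items()}
    # The sigmoid output is read as the probability of class index 1.
    return [index_to_label[int(p > threshold)] for p in probabilities]


# Hypothetical file names, mapping, and probabilities for illustration only.
filenames = ['cat.0.jpg', 'dog.0.jpg']
print([label_from_filename(name) for name in filenames])

class_indices = {'cat': 0, 'dog': 1}  # e.g. copied from train_generator.class_indices
print(decode_binary_predictions([0.1, 0.8, 0.4], class_indices))
```

With string labels in the dataframe, class_indices is keyed by 'cat' and 'dog' directly, so the decoded labels stay correct whichever integer each class ends up with.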