Keras predict always outputs "1."

Time: 2018-05-24 20:47:47

Tags: python tensorflow keras

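I trained a binary classifier to distinguish clear MNIST images from blurry ones. All images are 28 * 28 * 1 grayscale digits; I have 40000 for training, 10000 for validation, and 8000 for testing. My code is as follows: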

from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K
import cv2
import numpy as np
import glob
from PIL import Image

img_width, img_height = 28, 28#all MNIST images are of size (28*28)

train_data_dir = '/Binary Classifier/data/train'#train directory generated by train_cla
validation_data_dir = '/Binary Classifier/data/val'#validation directory generated by val_cla
train_samples = 40000
validation_samples = 10000
epochs = 20
batch_size = 16

if K.image_data_format() == 'channels_first':
    input_shape = (1, img_width, img_height)
else:
    input_shape = (img_width, img_height, 1)

#build a sequential model to train data
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=input_shape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

train_datagen = ImageDataGenerator(#train data generator
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)

val_datagen = ImageDataGenerator(rescale=1. / 255)#validation data generator

train_generator = train_datagen.flow_from_directory(#train generator
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary',color_mode = 'grayscale')

validation_generator = val_datagen.flow_from_directory(#validation generator
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary',color_mode = 'grayscale')

model.fit_generator(#fit the generator to train and validate the model
    train_generator,
    steps_per_epoch=train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=validation_samples // batch_size)

#model.save_weights('output.h5')#save the output as HDF5 file
filelist = glob.glob('/Binary Classifier/data/image_data/*.png')
x = np.array([np.array(Image.open(fname)) for fname in filelist])
x = np.expand_dims(x, axis=3)
ones=model.predict(x)

However, every prediction in my output is [1.], even though the training accuracy is actually very high (almost perfect). Does anyone know why?

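Edit: I think I might get more help if I describe my image data. Basically, each MNIST image in the directories is either a clear digit or a blurry digit. All of them are (28 * 28 * 1) grayscale images in .png format. '/Binary Classifier/data/train' holds 40000 digits for training, '/Binary Classifier/data/val' holds 10000 digits for validation, and '/Binary Classifier/data/image_data/' holds 58000 for testing.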

1 answer:

Answer 0: (score: 0)

Some suggestions:

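  • Pull data directly from one of your generators and test on that. Treat the generator like a list in a for loop to get image/label pairs. This should rule out any differences in how you fetch the data and in its format (e.g. channel order).
  • Check how many examples there are in each subdirectory of train/ and val/.
  • Change your metric to binary_accuracy, since you are treating the problem as binary classification (the network has only one output).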
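
The first two checks can be sketched roughly as below. This is a minimal sketch, not a definitive diagnosis: `describe_batch` and `count_examples` are hypothetical helper names, and the two arrays are synthetic stand-ins for a generator batch and for images loaded by hand with PIL. One thing worth comparing is the value range: the generators in the question apply rescale=1. / 255, while the array passed to model.predict is built straight from Image.open without rescaling, so the two pipelines may not feed the network inputs on the same scale.

```python
import os
import numpy as np


def describe_batch(name, batch):
    """Summarise an image batch so two input pipelines can be compared."""
    batch = np.asarray(batch)
    summary = {
        "shape": batch.shape,
        "dtype": str(batch.dtype),
        "min": float(batch.min()),
        "max": float(batch.max()),
    }
    print(name, summary)
    return summary


def count_examples(root):
    """Return {class_subdirectory: file_count} for a directory laid out
    the way flow_from_directory expects (one subdirectory per class)."""
    counts = {}
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if os.path.isdir(path):
            counts[entry] = sum(
                1 for f in os.listdir(path)
                if os.path.isfile(os.path.join(path, f))
            )
    return counts


# Synthetic stand-ins (assumptions for illustration only):
# manual_like mimics PNGs loaded with PIL and left as uint8 in [0, 255];
# generator_like mimics the same images after rescale=1/255.
rng = np.random.default_rng(0)
manual_like = rng.integers(0, 256, size=(16, 28, 28, 1), dtype=np.uint8)
generator_like = manual_like.astype("float32") / 255.0

gen_summary = describe_batch("generator batch", generator_like)
manual_summary = describe_batch("manual batch", manual_like)
```

In the script from the question, the real inputs would be something like `x_batch, y_batch = next(train_generator)` followed by `describe_batch("generator batch", x_batch)` and `describe_batch("manual batch", x)`, plus `count_examples(train_data_dir)` and `count_examples(validation_data_dir)` for the class balance. If the shapes, dtypes, or value ranges differ, running `model.predict(x_batch)` and comparing against `y_batch` shows whether the model itself behaves as the training accuracy suggests.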