Keras CNN: high training accuracy but wrong predictions. How to improve it?

Date: 2019-09-20 17:34:19

Tags: python machine-learning image-processing keras conv-neural-network

I have to detect a retinal disease with a CNN. I have 1400 images, 700 per class. My classes are (0 - no PDR) and (1 - PDR). I'm trying to build a model that tells whether an input retina image has the grade-4 disease (PDR) or not.

I'm applying the following operation to my images, and resizing them all to 256x256:

ImageCV[index] = cv2.addWeighted(ImageCV[index],4, cv2.GaussianBlur(ImageCV[index],(0,0), 256/30), -4, 128)

This is what it does to my image: https://imgur.com/X1p9G1c
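For reference, a minimal self-contained sketch of that preprocessing step (the filename is hypothetical; the sigma of 256/30 follows the line above, and the operation resembles the local-contrast normalization popularized in the Kaggle Diabetic Retinopathy competition):

import cv2

img = cv2.imread('retina.jpg')                   # hypothetical input file
img = cv2.resize(img, (256, 256))                # resize first, as in predict.py below
blur = cv2.GaussianBlur(img, (0, 0), 256 / 30)   # ksize (0, 0): kernel derived from sigma
img = cv2.addWeighted(img, 4, blur, -4, 128)     # 4*img - 4*blur + 128: subtract the local average, centre around grey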

Then, when I train the model I get a very high accuracy (99.something), but when I try to predict some test images it fails. For example, I put 10 PDR examples in the test folder and tried to predict them (all of them should be 1). These are the results:

[[0.]]
[[0.]]
[[1.]]
[[0.]]
[[0.]]
[[0.]]
[[1.]]
[[0.]]
[[0.]]
[[0.]]

Here is my model:

visible = Input(shape=(256,256,3))
conv1 = Conv2D(16, kernel_size=(3,3), activation='relu', strides=(1, 1))(visible)
conv2 = Conv2D(16, kernel_size=(3,3), activation='relu', strides=(1, 1))(conv1)
bat1 = BatchNormalization()(conv2)
conv3 = ZeroPadding2D(padding=(1, 1))(bat1)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv3)

conv4 = Conv2D(32, kernel_size=(3,3), activation='relu', padding='valid', kernel_regularizer=regularizers.l2(0.01))(pool1)
conv5 = Conv2D(32, kernel_size=(3,3), activation='relu', padding='valid', kernel_regularizer=regularizers.l2(0.01))(conv4)
bat2 = BatchNormalization()(conv5)
pool2 = MaxPooling2D(pool_size=(1, 1))(bat2)  # note: pool_size=(1, 1) here and in the blocks below is a no-op; the feature map passes through unchanged

conv6 = Conv2D(64, kernel_size=(3,3), activation='relu',strides=(1, 1), padding='valid')(pool2)
conv7 = Conv2D(64, kernel_size=(3,3), activation='relu',strides=(1, 1), padding='valid')(conv6)
bat3 = BatchNormalization()(conv7)
conv7 = ZeroPadding2D(padding=(1, 1))(bat3)
pool3 = MaxPooling2D(pool_size=(1, 1))(conv7)

conv8 = Conv2D(128, kernel_size=(3,3), activation='relu', padding='valid', kernel_regularizer=regularizers.l2(0.01))(pool3)
conv9 = Conv2D(128, kernel_size=(2,2), activation='relu', strides=(1, 1), padding='valid')(conv8)
bat4 = BatchNormalization()(conv9)
pool4 = MaxPooling2D(pool_size=(1, 1))(bat4)

flat = Flatten()(pool4)

output = Dense(1, activation='sigmoid')(flat)
model = Model(inputs=visible, outputs=output)

opt = optimizers.adam(lr=0.001, decay=0.0)

model.compile(optimizer= opt, loss='binary_crossentropy', metrics=['accuracy'])


data, labels = ReadImages(TRAIN_DIR)

test, lt = ReadImages(TEST_DIR)

data = np.array(data)
labels = np.array(labels)
test = np.array(test)
lt = np.array(lt)

# note: np.random.permutation returns a new index array, and these return
# values are discarded, so nothing is actually shuffled here
np.random.permutation(len(data))
np.random.permutation(len(labels))
np.random.permutation(len(test))
np.random.permutation(len(lt))

model.fit(data, labels, epochs=7, validation_data = (test,lt))

model.save('model.h5')
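A minimal sketch of the in-place shuffle the four np.random.permutation lines presumably intended, using a single permutation so the images and labels stay aligned (the update further down applies the same fix):

perm = np.random.permutation(len(data))   # one permutation for both arrays
data, labels = data[perm], labels[perm]   # reorder images and labels together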

And here is predict.py:

model = load_model('model.h5')

for filename in os.listdir(r'v/'):   # note: lists 'v/' but reads from TEST_DIR below
    if filename.endswith(".jpg") or filename.endswith(".ppm") or filename.endswith(".jpeg"):
        ImageCV = cv2.resize(cv2.imread(os.path.join(TEST_DIR, filename)), (256,256))
        ImageCV = cv2.addWeighted(ImageCV,4, cv2.GaussianBlur(ImageCV,(0,0), 256/30), -4, 128)
        cv2.imshow('image', ImageCV)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
        ImageCV = ImageCV.reshape(-1,256,256,3)
        print(model.predict(ImageCV))
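For comparison, a self-contained sketch that batches all the test images, applies the same preprocessing as at training time, and thresholds the sigmoid outputs at 0.5 (the folder name is hypothetical):

import os
import cv2
import numpy as np
from keras.models import load_model

model = load_model('model.h5')
images = []
for filename in sorted(os.listdir('v/')):            # hypothetical test folder
    if filename.lower().endswith(('.jpg', '.jpeg', '.ppm')):
        img = cv2.resize(cv2.imread(os.path.join('v/', filename)), (256, 256))
        # same preprocessing as at training time
        img = cv2.addWeighted(img, 4, cv2.GaussianBlur(img, (0, 0), 256 / 30), -4, 128)
        images.append(img)

batch = np.array(images)                             # shape (n, 256, 256, 3)
probs = model.predict(batch)                         # sigmoid outputs in [0, 1]
preds = (probs > 0.5).astype(int)                    # 1 = PDR, 0 = no PDR
print(np.hstack([probs, preds]))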

What can I do to improve my predictions? Thanks a lot for any help.

UPDATE: Well, I tried to do everything the answers suggested, but it still doesn't work... Here is my code now:

visible = Input(shape=(256,256,3))
conv1 = Conv2D(16, kernel_size=(3,3), activation='relu', strides=(1, 1))(visible)
conv2 = Conv2D(32, kernel_size=(3,3), activation='relu', strides=(1, 1))(conv1)
bat1 = BatchNormalization()(conv2)
conv3 = ZeroPadding2D(padding=(1, 1))(bat1)
pool1 = MaxPooling2D(pool_size=(2, 2))(conv3)
drop1 = Dropout(0.30)(pool1)

conv4 = Conv2D(32, kernel_size=(3,3), activation='relu', padding='valid', kernel_regularizer=regularizers.l2(0.01))(drop1)
conv5 = Conv2D(64, kernel_size=(3,3), activation='relu', padding='valid', kernel_regularizer=regularizers.l2(0.01))(conv4)
bat2 = BatchNormalization()(conv5)
pool2 = MaxPooling2D(pool_size=(1, 1))(bat2)
drop1 = Dropout(0.30)(pool2)   # note: this drop1 is never used; conv6 below takes pool2, so the dropout is not in the graph

conv6 = Conv2D(128, kernel_size=(3,3), activation='relu', padding='valid', kernel_regularizer=regularizers.l2(0.01))(pool2)
conv7 = Conv2D(128, kernel_size=(2,2), activation='relu', strides=(1, 1), padding='valid')(conv6)
bat3 = BatchNormalization()(conv7)
pool3 = MaxPooling2D(pool_size=(1, 1))(bat3)
drop1 = Dropout(0.30)(pool3)   # note: also unused; the Flatten below takes pool3

flat = Flatten()(pool3)
drop4 = Dropout(0.50)(flat)

output = Dense(1, activation='sigmoid')(drop4)
model = Model(inputs=visible, outputs=output)

opt = optimizers.adam(lr=0.001, decay=0.0)

model.compile(optimizer= opt, loss='binary_crossentropy', metrics=['accuracy'])

data, labels = ReadImages(TRAIN_DIR)
test, lt = ReadImages(TEST_DIR)

data = np.array(data)
labels = np.array(labels)

perm = np.random.permutation(len(data))
data = data[perm]
labels = labels[perm]
#model.fit(data, labels, epochs=8, validation_data = (np.array(test), np.array(lt)))

aug = ImageDataGenerator(rotation_range=20, width_shift_range=0.2, height_shift_range=0.2, shear_range=0.15,
    horizontal_flip=True)

# train the network
model.fit_generator(aug.flow(data, labels, batch_size=32),
    validation_data=(np.array(test), np.array(lt)), steps_per_epoch=len(data) // 32,
    epochs=7)

And here is what training returns:

Epoch 1/7
43/43 [==============================] - 1004s 23s/step - loss: 1.8090 - acc: 0.9724 - val_loss: 1.7871 - val_acc: 0.9861
Epoch 2/7
43/43 [==============================] - 1003s 23s/step - loss: 1.8449 - acc: 0.9801 - val_loss: 1.4828 - val_acc: 1.0000
Epoch 3/7
43/43 [==============================] - 1092s 25s/step - loss: 1.5704 - acc: 0.9920 - val_loss: 1.3985 - val_acc: 1.0000
Epoch 4/7
43/43 [==============================] - 1062s 25s/step - loss: 1.5219 - acc: 0.9898 - val_loss: 1.3167 - val_acc: 1.0000
Epoch 5/7
43/43 [==============================] - 990s 23s/step - loss: 2.5744 - acc: 0.9222 - val_loss: 2.9347 - val_acc: 0.9028
Epoch 6/7
43/43 [==============================] - 983s 23s/step - loss: 1.6053 - acc: 0.9840 - val_loss: 1.3299 - val_acc: 1.0000
Epoch 7/7
43/43 [==============================] - 974s 23s/step - loss: 1.6180 - acc: 0.9801 - val_loss: 1.5181 - val_acc: 0.9861

I added dropout, reduced the model's layers, and added data augmentation, and none of it works at all (all predictions return 0)...

Please, can anyone help?

3 Answers:

Answer 0 (score: 0)

It seems you have an overfitting problem. I have a personal dilemma here about whether this is off-topic or not, since the approach one can take is opinion-based, but here I go: first, if you need to regularize an overfitting network, you will want to start with a dropout of 0.25 and check whether that improves the model. Data augmentation is a must when dealing with overfitting, together with batch normalization (which you are already applying). If this still does not solve your overfitting problem, then you should try working on your network architecture so that it generalizes better. Have you done a simple sanity check on the inputs used for training and testing?

TL;DR: Try dropout and data augmentation; if that doesn't work and your data is correct, you may have to work on improving the architecture to build a model that generalizes better.

Edit: The consensus on approaching this kind of model is to first overfit it with adequate accuracy, and then to generalize it without losing accuracy where possible.
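For illustration, a minimal sketch (not the asker's exact model) of where a dropout of 0.25 could go, after each pooling block:

from keras.layers import Input, Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.models import Model

inp = Input(shape=(256, 256, 3))
x = Conv2D(16, (3, 3), activation='relu')(inp)
x = MaxPooling2D((2, 2))(x)
x = Dropout(0.25)(x)                  # start at 0.25 and tune from there
x = Conv2D(32, (3, 3), activation='relu')(x)
x = MaxPooling2D((2, 2))(x)
x = Dropout(0.25)(x)
x = Flatten()(x)
out = Dense(1, activation='sigmoid')(x)
model = Model(inputs=inp, outputs=out)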

Answer 1 (score: 0)

Your model is too large for the size of your training set, which leads to overfitting. Reduce the number of layers.

Answer 2 (score: 0)

You are overfitting, i.e. you have a high-variance problem. Try adding some light dropout (0.2-0.3) at the end of your convolutional blocks. You could also add a couple of Dense layers with decreasing unit counts before the output layer, with heavier dropout layers (0.5+) between them.
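A sketch of that suggested head, assuming pool3 is the last pooling output of the model in the question:

from keras.layers import Flatten, Dense, Dropout

x = Flatten()(pool3)                  # pool3 from the question's model
x = Dense(128, activation='relu')(x)  # decreasing unit counts towards the output
x = Dropout(0.5)(x)
x = Dense(64, activation='relu')(x)
x = Dropout(0.5)(x)
output = Dense(1, activation='sigmoid')(x)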

You should also use heavier data augmentation, such as rotations, flips, random noise, random brightness, etc. Check the documentation of Keras's ImageDataGenerator class.
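For example, a sketch of a broader ImageDataGenerator setup (brightness_range requires a reasonably recent Keras; random noise would need a custom preprocessing_function):

from keras.preprocessing.image import ImageDataGenerator

aug = ImageDataGenerator(
    rotation_range=30,                # random rotations
    horizontal_flip=True,             # random flips
    vertical_flip=True,
    brightness_range=(0.8, 1.2),      # random brightness
    width_shift_range=0.1,
    height_shift_range=0.1)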