High training and validation accuracy, but poor predictions

Time: 2020-04-08 09:04:54

Tags: python keras deep-learning cnn imbalanced-data

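I am training a pretrained model to predict the clinical significance of medical images. I get good training and validation accuracy (about 90%), but the test accuracy is poor (prediction accuracy of 50%). The dataset I use to train the model has 3000 samples labeled "True" or "False". The dataset is imbalanced: there are three times as many False samples as True samples. To compensate for this 1:3 ratio, I set class_weight when fitting the model.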
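For illustration, here is a minimal sketch of how a 1:3 weighting like this can be derived from label counts. It assumes that class 0 means False and class 1 means True, and that the 3000 samples split into 2250 False and 750 True; the helper name `ratio_class_weights` is hypothetical and is not part of Keras.

```python
import numpy as np

def ratio_class_weights(labels):
    """Weight each class by (largest class count / its own count)."""
    classes, counts = np.unique(labels, return_counts=True)
    return {int(c): float(counts.max() / n) for c, n in zip(classes, counts)}

# Assumed split of the 3000 samples: 2250 False (0) and 750 True (1).
labels = np.array([0] * 2250 + [1] * 750)
class_weights = ratio_class_weights(labels)
print(class_weights)  # {0: 1.0, 1: 3.0}
```

The resulting dictionary has the same form as the one passed to `class_weight` in `model.fit`.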

Here is my code:

from keras import models
from keras import layers
from keras.utils import np_utils, generic_utils
from keras.utils import to_categorical
from keras.preprocessing.image import ImageDataGenerator
from keras.applications import VGG19
from sklearn.model_selection import KFold
from keras.optimizers import adam
from keras.callbacks import CSVLogger  # needed for the per-fold CSV log below
import pickle
import numpy as np

### CNN model and training ###
base_model = VGG19(weights='imagenet', include_top=False, input_shape=(224,224,3))

# Define the K-fold Cross Validator
kfold = KFold(n_splits=5, shuffle=True)
acc_per_fold = []       # Define per-fold score containers
loss_per_fold = []      # Define per-fold score containers
fold_no = 1
for train, val in kfold.split(xtrain, ytrain):
    X_train = xtrain[train]
    X_val = xtrain[val]
    X_train = X_train.astype('float32')
    X_val = X_val.astype('float32')
    y_train = ytrain[train]
    y_val = ytrain[val]


    ### Model architecture
    model = models.Sequential()
    model.add(base_model)
    model.add(layers.Conv2D(64, (3, 3), activation='relu', padding='same'))
    model.add(layers.MaxPooling2D((2,2), strides=(2,2)))
    model.add(layers.Flatten())
    model.add(layers.Dense(256, activation='relu'))
    model.add(layers.Dense(256, activation='relu'))
    model.add(layers.Dense(2, activation='softmax'))
    for layer in base_model.layers:
        layer.trainable = False
    model.summary()

    ### Compile the model
    opt = adam(lr=0.0001)
    model.compile(loss='categorical_crossentropy',
                  optimizer=opt,
                  metrics=['accuracy'])
    print('------------------------------------------------------------------------')
    print(f'Training for fold {fold_no} ...')

    Epochs_no = 20
    BS = 64
    CB = CSVLogger('VGG19_Test1_Fold'+str(fold_no)+'_CB.csv',separator = ',', append=False)
    class_weights = {0:1, 1:3}

    # Train the model
    history = model.fit(X_train, y_train,
                        epochs=Epochs_no,
                        batch_size=BS,
                        validation_data=(X_val, y_val),
                        callbacks=[CB],
                        class_weight=class_weights,
                        verbose=1)

    # Move on to the next fold so each fold writes its own CSV log
    fold_no += 1

Model summary

Model: "sequential_1"
_________________________________________________________________ 
Layer (type)                 Output Shape              Param #
=================================================================
vgg19 (Model)                (None, 7, 7, 512)         20024384
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 5, 5, 64)          294976
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 2, 2, 64)          0
_________________________________________________________________
flatten_1 (Flatten)          (None, 256)               0
_________________________________________________________________
dense_1 (Dense)              (None, 256)               65792
_________________________________________________________________
dense_2 (Dense)              (None, 256)               65792
_________________________________________________________________ 
dense_3 (Dense)              (None, 2)                 514
=================================================================
Total params: 20,451,458
Trainable params: 427,074
Non-trainable params: 20,024,384
_________________________________________________________________
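As a sanity check, the shapes and parameter counts in this summary can be reproduced by hand. The sketch below uses the standard formulas for Conv2D and Dense layers; the helper names are hypothetical. Note that the 5x5 Conv2D output matches a 3x3 kernel with 'valid' padding, whereas padding='same' as written in the code listed earlier would keep the output at 7x7.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights (kernel * kernel * in_channels per filter) plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units

# Spatial sizes, assuming 'valid' padding for the 3x3 convolution.
conv_out = 7 - 3 + 1                 # 5
pool_out = (conv_out - 2) // 2 + 1   # 2 (2x2 pooling, stride 2)
flat = pool_out * pool_out * 64      # 256

trainable = (conv2d_params(3, 512, 64)
             + dense_params(flat, 256)
             + dense_params(256, 256)
             + dense_params(256, 2))
total = trainable + 20024384         # frozen VGG19 base, from the summary

print(conv_out, pool_out, flat)      # 5 2 256
print(trainable, total)              # 427074 20451458
```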

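My test dataset is preprocessed in the same way as the training data (resampling, normalization, ...). Half of the test samples are True and the other half are False. However, the trained model still predicts "False" more often, even though I set class_weight so that the model would pay more attention to the minority class (True) during training.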
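To make this behaviour concrete, here is a minimal sketch of a per-class breakdown of test predictions. The labels and predictions are made up for illustration (they are not my actual results), and `confusion_counts` is a hypothetical helper. It shows how a model that mostly predicts False ends up at 50% accuracy on a balanced test set.

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    """Return a 2x2 matrix: rows are true classes, columns are predicted classes."""
    matrix = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_pred):
        matrix[t, p] += 1
    return matrix

# Hypothetical balanced test set: 100 False (0) and 100 True (1).
y_true = np.array([0] * 100 + [1] * 100)
# Hypothetical predictions from a model that almost always says False.
y_pred = np.array([0] * 95 + [1] * 5 + [0] * 95 + [1] * 5)

matrix = confusion_counts(y_true, y_pred)
accuracy = np.trace(matrix) / matrix.sum()
recall_per_class = matrix.diagonal() / matrix.sum(axis=1)

print(matrix)            # [[95  5] [95  5]]
print(accuracy)          # 0.5
print(recall_per_class)  # [0.95 0.05]
```

With the softmax output used here, the predicted class would be obtained as `model.predict(x_test).argmax(axis=1)`, where `x_test` stands for the preprocessed test images.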

I don't understand why the model fails on the test set. Can anyone offer some suggestions on this problem? Thanks a lot!

0 answers:

No answers