I am trying to classify the letters of American Sign Language, so this is a multi-class classification task with 26 classes. My CNN model gives 84% training accuracy and 91% validation accuracy, but the test accuracy is ridiculously low - only 7.7%!
I use ImageDataGenerator to generate the training and validation data:
datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=0.2,
    width_shift_range=0.05,
    height_shift_range=0.05,
    shear_range=0.05,
    horizontal_flip=True,
    fill_mode='nearest',
    validation_split=0.2)
img_height = img_width = 256
batch_size = 16
source = '/home/hp/asl_detection/train'
train_generator = datagen.flow_from_directory(
    source,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    shuffle=True,
    class_mode='categorical',
    subset='training',  # set as training data
    color_mode='grayscale',
    seed=42,
)
validation_generator = datagen.flow_from_directory(
    source,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    shuffle=True,
    class_mode='categorical',
    subset='validation',  # set as validation data
    color_mode='grayscale',
    seed=42,
)
Here is my model code:
img_rows = 256
img_cols = 256
def get_net():
    inputs = Input((img_rows, img_cols, 1))
    print("inputs shape:", inputs.shape)
    # Convolution layers
    conv1 = Conv2D(24, 3, strides=(2, 2), activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(inputs)
    print("conv1 shape:", conv1.shape)
    conv2 = Conv2D(24, 3, strides=(2, 2), activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv1)
    print("conv2 shape:", conv2.shape)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv2)
    print("pool1 shape:", pool1.shape)
    drop1 = Dropout(0.25)(pool1)
    conv3 = Conv2D(36, 3, strides=(2, 2), activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(drop1)
    print("conv3 shape:", conv3.shape)
    conv4 = Conv2D(36, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv3)
    print("conv4 shape:", conv4.shape)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv4)
    print("pool2 shape:", pool2.shape)
    drop2 = Dropout(0.25)(pool2)
    conv5 = Conv2D(48, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(drop2)
    print("conv5 shape:", conv5.shape)
    conv6 = Conv2D(48, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv5)
    print("conv6 shape:", conv6.shape)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv6)
    print("pool3 shape:", pool3.shape)
    drop3 = Dropout(0.25)(pool3)
    # Flattening
    flat = Flatten()(drop3)
    # Fully connected layers
    dense1 = Dense(128, activation = 'relu', use_bias=True, kernel_initializer = 'he_normal')(flat)
    print("dense1 shape:", dense1.shape)
    drop4 = Dropout(0.5)(dense1)
    dense2 = Dense(128, activation = 'relu', use_bias=True, kernel_initializer = 'he_normal')(drop4)
    print("dense2 shape:", dense2.shape)
    drop5 = Dropout(0.5)(dense2)
    dense4 = Dense(26, activation = 'softmax', use_bias=True, kernel_initializer = 'he_normal')(drop5)
    print("dense4 shape:", dense4.shape)
    model = Model(inputs=inputs, outputs=dense4)  # Keras 2 API: inputs/outputs
    optimizer = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=0.00000001, decay=0.0)
    model.compile(optimizer = optimizer, loss = 'categorical_crossentropy', metrics = ['accuracy'])
    return model
Here is the training code:
def train():
    model = get_net()
    print("got model")
    model.summary()
    model_checkpoint = ModelCheckpoint('seqnet.hdf5', monitor='loss', verbose=1, save_best_only=True)
    print('Fitting model...')
    history = model.fit_generator(
        train_generator,
        steps_per_epoch = train_generator.samples // batch_size,
        validation_data = validation_generator,
        validation_steps = validation_generator.samples // batch_size,
        epochs = 100)
    # list all data in history
    print(history.history.keys())
    # summarize history for accuracy
    plt.plot(history.history['acc'])
    plt.plot(history.history['val_acc'])
    plt.title('model accuracy')
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'validation'], loc='upper left')
    plt.show()
    # summarize history for loss
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model loss')
    plt.ylabel('loss')
    plt.xlabel('epoch')
    plt.legend(['train', 'validation'], loc='upper left')
    plt.show()
    return model
model = train()
Here is the training log for the last few epochs:
Epoch 95/100
72/72 [==============================] - 74s 1s/step - loss: 0.4326 - acc: 0.8523 - val_loss: 0.2198 - val_acc: 0.9118
Epoch 96/100
72/72 [==============================] - 89s 1s/step - loss: 0.4591 - acc: 0.8418 - val_loss: 0.1944 - val_acc: 0.9412
Epoch 97/100
72/72 [==============================] - 90s 1s/step - loss: 0.4387 - acc: 0.8533 - val_loss: 0.2802 - val_acc: 0.8971
Epoch 98/100
72/72 [==============================] - 106s 1s/step - loss: 0.4680 - acc: 0.8349 - val_loss: 0.2206 - val_acc: 0.9228
Epoch 99/100
72/72 [==============================] - 85s 1s/step - loss: 0.4459 - acc: 0.8427 - val_loss: 0.2861 - val_acc: 0.9081
Epoch 100/100
72/72 [==============================] - 74s 1s/step - loss: 0.4639 - acc: 0.8472 - val_loss: 0.2866 - val_acc: 0.9191
dict_keys(['val_loss', 'loss', 'acc', 'val_acc'])
These are the curves of model accuracy and loss:
Unlike for the training and validation data, I did not use ImageDataGenerator to prepare the test data. For the test data, I used OpenCV to convert the images to grayscale and then normalized them. In the same loop I generated the corresponding label for each image to prevent any order mismatch. I saved the image file names and labels in a csv file. Here is the code:
source = '/home/hp/asl_detection/test/unknown'
files = os.listdir(source)
test_data = []
rows = []
for file in files:
    row = []
    row.append(file)
    row.append(file[6])
    print(file)
    row.append(ord(file[6]) - 97)
    rows.append(row)
    img = cv2.imread(os.path.join(source, file))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img = cv2.resize(img, (256, 256))
    test_data.append(img)
test_data = np.array(test_data, dtype="float") / 255.0
print(test_data)
print(test_data.shape)
with open("/home/hp/asl_detection/test/alpha_class.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(rows)
Here are a few rows of the csv:
I further reshaped the test image array to add the channel dimension:
test_data = test_data.reshape((test_data.shape[0], img_rows, img_cols, 1))
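Since the test images go through a cv2 pipeline instead of the generator, a quick sanity check (a sketch reusing the train_generator and test_data defined above) is to compare basic pixel statistics of a rescaled training batch with the test array; a gross mismatch such as [0, 255] versus [0, 1] would show up immediately:

# Sketch: compare pixel statistics of a training batch with the cv2-preprocessed
# test array to catch obvious preprocessing mismatches (scaling, dtype, etc.).
x_batch, _ = next(train_generator)  # shape (batch_size, 256, 256, 1), rescaled to [0, 1]
print("train batch mean/std:", x_batch.mean(), x_batch.std())
print("test array  mean/std:", test_data.mean(), test_data.std())

The training batch is augmented, so the statistics will not match exactly, but the values should at least be on the same scale.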
Finally, I predicted the classes and computed the accuracy on the test data, taking the labels from the csv:
y_proba = model.predict(test_data)
y_classes = y_proba.argmax(axis=-1)
data = pd.read_csv('/home/hp/asl_detection/test/alpha_class.csv', header=None)
original_classes = data.iloc[:, 2]
original_classes = original_classes.tolist()
y_classes = y_classes.tolist()
acc = accuracy_score(original_classes, y_classes) * 100
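One mapping detail that may also be worth double-checking (a sketch, assuming the training class folders are named with the lowercase letters a-z): flow_from_directory assigns class indices by sorting the folder names alphabetically, so those indices only agree with the ord(file[6]) - 97 labels derived from the test filenames if the folders are named exactly 'a' through 'z'.

# Sketch: verify that the generator's class indices match the filename-derived labels.
print(train_generator.class_indices)
expected = {chr(c): c - 97 for c in range(97, 123)}  # {'a': 0, ..., 'z': 25}
print(train_generator.class_indices == expected)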
Can you find the reason for such low test accuracy? Please let me know if you need any further information.
Answer (score: 0):
I think you are facing an overfitting problem, and the validation set is misleading you. For the validation score not to mislead you, the validation set must have the same distribution as the test set, so try to draw the validation and test sets from the same distribution, and do not apply data augmentation to the validation dataset.
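A minimal sketch of that suggestion, reusing the paths and settings from the question: keep the augmentation in the training generator only, and build the validation split from a second ImageDataGenerator that applies nothing but the 1./255 rescaling, which matches the cv2 + /255 preprocessing of the test images. Both generators use the same validation_split, so the split over the directory stays consistent.

# Sketch: augmentation only on the training subset; the validation subset is only rescaled.
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=0.2,
    width_shift_range=0.05,
    height_shift_range=0.05,
    shear_range=0.05,
    horizontal_flip=True,
    fill_mode='nearest',
    validation_split=0.2)

val_datagen = ImageDataGenerator(
    rescale=1./255,
    validation_split=0.2)  # same split ratio, no augmentation

train_generator = train_datagen.flow_from_directory(
    source,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='training',
    color_mode='grayscale',
    shuffle=True,
    seed=42)

validation_generator = val_datagen.flow_from_directory(
    source,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation',
    color_mode='grayscale',
    shuffle=False,
    seed=42)

With this setup the validation numbers should track the cv2-preprocessed test set more closely, since validation now sees un-augmented, rescaled grayscale images just like the test code does.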