Question

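I'm fine-tuning a ResNet50 model with about 10 million trainable parameters on 5 classes. The data is roughly 140,000 samples, 20% of which are used for validation. I should add that the batch size is 256, and the lr warms up linearly from 1e-5 to 3e-4 over the first 10 epochs, is then cosine-annealed twice from there over another 20 epochs (10-30), and plateaus at 5e-6 for epochs 30 to 40. The classes have a lot in common but are easy to tell apart.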
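The learning-rate schedule described above can be sketched as a plain function of the epoch index. This is only an illustration under assumptions: the constant names are made up, and the cosine phase is modelled as a single half-cosine from 3e-4 down to 5e-6, which the post does not spell out.

```python
import math

# Values taken from the question; the names are hypothetical.
WARMUP_EPOCHS = 10   # linear warm-up covers epochs 0-9
ANNEAL_END = 30      # cosine phase covers epochs 10-29
LR_START = 1e-5
LR_PEAK = 3e-4
LR_FLOOR = 5e-6      # plateau value for epochs 30-40


def lr_at_epoch(epoch, lr=None):
    """Return the learning rate for a 0-indexed epoch.

    The unused `lr` argument lets the function be called either as
    f(epoch) or as f(epoch, current_lr).
    """
    if epoch < WARMUP_EPOCHS:
        # Linear warm-up from LR_START to LR_PEAK.
        return LR_START + (LR_PEAK - LR_START) * epoch / WARMUP_EPOCHS
    if epoch < ANNEAL_END:
        # Assumed shape: one half-cosine from LR_PEAK down to LR_FLOOR.
        progress = (epoch - WARMUP_EPOCHS) / (ANNEAL_END - WARMUP_EPOCHS)
        return LR_FLOOR + 0.5 * (LR_PEAK - LR_FLOOR) * (1 + math.cos(math.pi * progress))
    # Plateau.
    return LR_FLOOR
```

A function of this shape could be handed to a Keras `LearningRateScheduler` callback, which is one common way to apply a per-epoch schedule.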

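The attached image says it all: after epoch 10 everything increases, including the loss, accuracy, precision and recall (not to mention F1, top-1, etc., which also rise steadily). In practice, predictions produce far more false positives than expected, with probabilities as high as 99%.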
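A toy calculation shows how loss and accuracy can rise together. This is a sketch with invented numbers, not the data from the post: cross-entropy penalises a confident mistake very heavily, so a single wrong prediction at 99% can outweigh the gain from several newly correct ones.

```python
import math


def mean_cross_entropy(probs, labels):
    """Average of -log(probability assigned to the true class)."""
    return sum(-math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def accuracy(probs, labels):
    """Fraction of samples whose argmax matches the true class."""
    hits = sum(1 for p, y in zip(probs, labels) if p.index(max(p)) == y)
    return hits / len(labels)


# Hypothetical 3-class predictions for four validation samples.
labels = [0, 1, 2, 0]

# Earlier epoch: two samples right, two wrong, all with modest confidence.
early = [[0.50, 0.30, 0.20],
         [0.30, 0.50, 0.20],
         [0.40, 0.35, 0.25],
         [0.35, 0.40, 0.25]]

# Later epoch: three samples right, one wrong, all with ~99% confidence.
late = [[0.98, 0.01, 0.01],
        [0.01, 0.98, 0.01],
        [0.01, 0.01, 0.98],
        [0.005, 0.99, 0.005]]

early_acc, late_acc = accuracy(early, labels), accuracy(late, labels)
early_loss, late_loss = mean_cross_entropy(early, labels), mean_cross_entropy(late, labels)

print(f"accuracy: {early_acc:.2f} -> {late_acc:.2f}")
print(f"loss:     {early_loss:.3f} -> {late_loss:.3f}")
```

Here accuracy goes up because one more sample is classified correctly, while the mean loss also goes up because the remaining mistake is made with very high confidence.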

What exactly is going on here?

...

base_model = ResNet50(include_top=False, weights='imagenet', input_tensor=Input(shape=(224, 224, 3)))
x = base_model.output
x = GlobalMaxPooling2D(name='feature_extract')(x)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
x = Dense(len(classIDs), activation="softmax", name='all_classes_hatto')(x)
classifier = Model(inputs=base_model.input, outputs=[x])

...

Thanks for your help. Steve

Answer 1

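Fixed. Four (4) things had to be done:

1) The data was a problem, so I cleaned up more of it

2) Added more noise to the model

3) Batchnorm after activation in nearly the last layers

4) Switched to DenseNet201, which learns deeper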

x = base_model.output
x = GlobalMaxPooling2D(name='feature_extract')(x)
# Add Gaussian noise
x = GaussianNoise(0.5)(x)
# Add batchnorm AFTER activation
x = Activation('relu')(x)
x = BatchNormalization()(x)
x = Dense(512, activation='linear')(x)
# Add batchnorm AFTER activation w/o DropOut
x = Activation('relu')(x)
x = BatchNormalization()(x)
x = Dense(len(classIDs), activation="softmax", name='all_classes_hatto')(x)

classifier = Model(inputs=base_model.input, outputs=[x])
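For reference, the head above can be assembled into a self-contained sketch on top of DenseNet201 (change 4). Several things here are assumptions rather than the author's exact code: the `build_classifier` helper and `NUM_CLASSES` constant are hypothetical names, `NUM_CLASSES = 5` stands in for `len(classIDs)`, and `weights=None` is the default only so the sketch builds without downloading anything. Passing `weights='imagenet'` would match the fine-tuning setup in the question.

```python
from tensorflow.keras.applications import DenseNet201
from tensorflow.keras.layers import (Activation, BatchNormalization, Dense,
                                     GaussianNoise, GlobalMaxPooling2D, Input)
from tensorflow.keras.models import Model

NUM_CLASSES = 5  # assumption: the question mentions 5 classes


def build_classifier(num_classes=NUM_CLASSES, weights=None):
    """Build the DenseNet201 backbone plus the noise/batchnorm head."""
    base_model = DenseNet201(include_top=False, weights=weights,
                             input_tensor=Input(shape=(224, 224, 3)))
    x = base_model.output
    x = GlobalMaxPooling2D(name='feature_extract')(x)
    # Gaussian noise is only active during training.
    x = GaussianNoise(0.5)(x)
    # Batchnorm AFTER activation.
    x = Activation('relu')(x)
    x = BatchNormalization()(x)
    x = Dense(512, activation='linear')(x)
    # Batchnorm AFTER activation, without Dropout.
    x = Activation('relu')(x)
    x = BatchNormalization()(x)
    x = Dense(num_classes, activation='softmax', name='all_classes_hatto')(x)
    return Model(inputs=base_model.input, outputs=x)


classifier = build_classifier()
```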

This worked, and now I finally have a great model (the blue line).

Steve

CNN training gives strange results: val loss increases while val accuracy/precision/recall increase as well
