Question

I built a CNN model to classify text data. Please help me interpret my results and explain why my training accuracy is lower than my validation accuracy.

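I have 2619 samples in total, all of them text, belonging to two different classes. Here is a sample of my dataset.

The validation set contains 34 samples. The rest of the 2619 samples are training data.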

I performed RepeatedKFold cross-validation. Here is my code.

from sklearn.model_selection import RepeatedKFold

kf = RepeatedKFold(n_splits=75, n_repeats=1, random_state=42)

for train_index, test_index in kf.split(X, Y):
    # print("Train:", train_index, "Validation:", test_index)
    x_train, x_test = X.iloc[train_index], X.iloc[test_index]
    y_train, y_test = Y.iloc[train_index], Y.iloc[test_index]

I used a CNN. Here is my model, along with the training and evaluation code that produced my results.

from keras import optimizers
from keras.layers import BatchNormalization, Conv1D, Dense, Dropout, Embedding, Flatten
from keras.models import Sequential
from keras.regularizers import l2

# Embedding -> Conv1D -> BatchNorm -> Dropout x2 -> Flatten -> Dense -> Dropout
model = Sequential()
model.add(Embedding(2900, 2, input_length=1))
model.add(Conv1D(filters=2, kernel_size=3, kernel_regularizer=l2(0.0005), bias_regularizer=l2(0.0005), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.3))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(1, kernel_regularizer=l2(0.0005), bias_regularizer=l2(0.0005), activation='sigmoid'))
model.add(Dropout(0.25))

adam = optimizers.Adam(lr=0.0005, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
model.compile(loss='binary_crossentropy', optimizer=adam, metrics=['accuracy'])
print(model.summary())

# Train on the training fold and validate on the held-out fold
history = model.fit(x_train, y_train, epochs=300, validation_data=(x_test, y_test), batch_size=128, shuffle=False)

# Final evaluation of the model
scores = model.evaluate(x_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1] * 100))

Please help me solve my problem. Thank you.

Answer 1

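You are probably regularizing your model too much, which causes it to underfit your data.
A good way to start is with no regularization at all (no Dropout, no weight decay, ...) and see whether the model overfits:

If it does not, regularization is useless.
If it overfits, add regularization little by little: start with a small dropout rate / weight decay, then increase it if the model keeps overfitting.

Moreover, do not put Dropout as the final layer, and do not put two Dropout layers in a row.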
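
To make that procedure concrete, here is a minimal sketch of such a baseline, assuming TensorFlow's bundled Keras. `build_model` and its `dropout_rate` and `l2_strength` parameters are hypothetical names introduced for illustration, not something from the question; the layer sizes are copied from the question's model. With the defaults there is no regularization at all, and each knob can then be raised in small steps.

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers


def build_model(vocab_size=2900, seq_len=1, dropout_rate=0.0, l2_strength=0.0):
    """Hypothetical helper: same layer sizes as the question's model,
    with regularization switched off unless explicitly requested."""
    # No weight decay unless a positive strength is passed in.
    reg = regularizers.l2(l2_strength) if l2_strength > 0 else None

    model = keras.Sequential()
    model.add(keras.Input(shape=(seq_len,)))
    model.add(layers.Embedding(vocab_size, 2))
    model.add(layers.Conv1D(filters=2, kernel_size=3, padding="same",
                            activation="relu", kernel_regularizer=reg))
    if dropout_rate > 0:
        # A single Dropout layer, placed before the output layer, never after it.
        model.add(layers.Dropout(dropout_rate))
    model.add(layers.Flatten())
    model.add(layers.Dense(1, activation="sigmoid", kernel_regularizer=reg))

    model.compile(loss="binary_crossentropy",
                  optimizer=keras.optimizers.Adam(learning_rate=0.0005),
                  metrics=["accuracy"])
    return model


# Step 1: unregularized baseline; train it and check whether it overfits.
baseline = build_model()

# Step 2 (only if the baseline overfits): add a little regularization,
# then increase it gradually if overfitting persists.
lightly_regularized = build_model(dropout_rate=0.1, l2_strength=1e-4)
```

Each model would then be trained with the same `model.fit(...)` call as in the question, comparing the training and validation curves between runs. The starting values 0.1 and 1e-4 are only illustrative guesses.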

Answer 2

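Your training accuracy may be lower than your validation accuracy because of dropout: during training it "switches off" some neurons to prevent overfitting. During validation dropout is disabled, so your network uses all of its neurons and therefore (in this particular case) makes more accurate predictions.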
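
The difference between the two modes can be illustrated with a small pure-Python sketch of inverted dropout, the variant that zeroes units during training and rescales the survivors. The `dropout` function below is a hypothetical stand-in written for this answer, not the Keras implementation.

```python
import random


def dropout(activations, rate, training, rng):
    """Inverted dropout: during training, zero each unit with probability
    `rate` and scale the survivors by 1 / (1 - rate); at inference,
    return the activations unchanged."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)
acts = [1.0] * 1000

train_out = dropout(acts, rate=0.3, training=True, rng=rng)
eval_out = dropout(acts, rate=0.3, training=False, rng=rng)

dropped = sum(1 for a in train_out if a == 0.0)
print("units zeroed during training:", dropped)  # roughly 30% of the 1000 units
print("units zeroed during evaluation:", sum(1 for a in eval_out if a == 0.0))
```

The training metrics are computed on the noisy, partially zeroed network, while the validation metrics are computed on the full one. With three Dropout layers, one of them after the output, the gap can be noticeable.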

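In general, I agree with Thibault Bacqueyrisses's advice, and I would add that placing dropout before batch normalization is usually bad practice as well (though that does not apply to this case).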

Training accuracy is lower than validation accuracy