I have built an NN model for a binary classification problem with Keras. Here is the code:
from keras import models, layers  # assumed import; the original post does not show it

# create a new model
nn_model = models.Sequential()
# add input and dense layer
# 128 is the number of hidden units and 22 is the number of features
nn_model.add(layers.Dense(128, activation='relu', input_shape=(22,)))
nn_model.add(layers.Dense(16, activation='relu'))
nn_model.add(layers.Dense(16, activation='relu'))
# add a final layer
nn_model.add(layers.Dense(1, activation='sigmoid'))
# I have 3000 rows split from the training set to monitor the accuracy and loss
# compile and train the model
nn_model.compile(optimizer='rmsprop',
                 loss='binary_crossentropy',
                 metrics=['acc'])
history = nn_model.fit(partial_x_train,
                       partial_y_train,
                       epochs=20,
                       # the batch size is the number of samples propagated through the network per update
                       batch_size=512,
                       validation_data=(x_val, y_val))
Here is the training log:
Train on 42663 samples, validate on 3000 samples
Epoch 1/20
42663/42663 [==============================] - 0s 9us/step - loss: 0.2626 - acc: 0.8960 - val_loss: 0.2913 - val_acc: 0.8767
Epoch 2/20
42663/42663 [==============================] - 0s 5us/step - loss: 0.2569 - acc: 0.8976 - val_loss: 0.2625 - val_acc: 0.9007
Epoch 3/20
42663/42663 [==============================] - 0s 5us/step - loss: 0.2560 - acc: 0.8958 - val_loss: 0.2546 - val_acc: 0.8900
Epoch 4/20
42663/42663 [==============================] - 0s 4us/step - loss: 0.2538 - acc: 0.8970 - val_loss: 0.2451 - val_acc: 0.9043
Epoch 5/20
42663/42663 [==============================] - 0s 5us/step - loss: 0.2526 - acc: 0.8987 - val_loss: 0.2441 - val_acc: 0.9023
Epoch 6/20
42663/42663 [==============================] - 0s 4us/step - loss: 0.2507 - acc: 0.8997 - val_loss: 0.2825 - val_acc: 0.8820
Epoch 7/20
42663/42663 [==============================] - 0s 4us/step - loss: 0.2504 - acc: 0.8993 - val_loss: 0.2837 - val_acc: 0.8847
Epoch 8/20
42663/42663 [==============================] - 0s 4us/step - loss: 0.2507 - acc: 0.8988 - val_loss: 0.2631 - val_acc: 0.8873
Epoch 9/20
42663/42663 [==============================] - 0s 4us/step - loss: 0.2471 - acc: 0.9012 - val_loss: 0.2788 - val_acc: 0.8823
Epoch 10/20
42663/42663 [==============================] - 0s 4us/step - loss: 0.2489 - acc: 0.8997 - val_loss: 0.2414 - val_acc: 0.9010
Epoch 11/20
42663/42663 [==============================] - 0s 5us/step - loss: 0.2471 - acc: 0.9017 - val_loss: 0.2741 - val_acc: 0.8880
Epoch 12/20
42663/42663 [==============================] - 0s 4us/step - loss: 0.2458 - acc: 0.9016 - val_loss: 0.2523 - val_acc: 0.8973
Epoch 13/20
42663/42663 [==============================] - 0s 4us/step - loss: 0.2433 - acc: 0.9022 - val_loss: 0.2571 - val_acc: 0.8940
Epoch 14/20
42663/42663 [==============================] - 0s 5us/step - loss: 0.2457 - acc: 0.9012 - val_loss: 0.2567 - val_acc: 0.8973
Epoch 15/20
42663/42663 [==============================] - 0s 5us/step - loss: 0.2421 - acc: 0.9020 - val_loss: 0.2411 - val_acc: 0.8957
Epoch 16/20
42663/42663 [==============================] - 0s 5us/step - loss: 0.2434 - acc: 0.9007 - val_loss: 0.2431 - val_acc: 0.9000
Epoch 17/20
42663/42663 [==============================] - 0s 5us/step - loss: 0.2431 - acc: 0.9021 - val_loss: 0.2398 - val_acc: 0.9000
Epoch 18/20
42663/42663 [==============================] - 0s 5us/step - loss: 0.2435 - acc: 0.9018 - val_loss: 0.2919 - val_acc: 0.8787
Epoch 19/20
42663/42663 [==============================] - 0s 5us/step - loss: 0.2409 - acc: 0.9029 - val_loss: 0.2478 - val_acc: 0.8943
Epoch 20/20
42663/42663 [==============================] - 0s 5us/step - loss: 0.2426 - acc: 0.9020 - val_loss: 0.2380 - val_acc: 0.9007
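I plotted the accuracy and loss for the training and validation sets:
As we can see, the results are not very stable. I chose two epochs and retrained on the whole training set; here is the new log: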
Epoch 1/2
45663/45663 [==============================] - 0s 7us/step - loss: 0.5759 - accuracy: 0.7004
Epoch 2/2
45663/45663 [==============================] - 0s 5us/step - loss: 0.5155 - accuracy: 0.7341
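My question is: why is the accuracy so unstable, why does the retrained model reach only 73% accuracy, and how can I improve the model? Thanks.
Answer 0 (score: 2)
Your validation set has 3000 samples and your training set has 42663, which means validation is only about 7% of the data. Your validation accuracy jumps between 0.88 and 0.90, a swing of about 2 percentage points. 7% validation data is too little to get good statistics, and with only 7% of the data a swing of that size is not bad. Normally the validation data should be 20% to 25% of the total data, i.e. a 75-25 train-validation split.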
Also make sure to shuffle the data before doing the train-val split.
If X and y are your complete dataset, then use
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)
This shuffles the data and gives you a 75-25 split.
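To put a rough number on "too little to get good statistics", here is a minimal sketch that treats every validation prediction as an independent Bernoulli trial and computes the binomial standard error of the measured accuracy. That independence assumption and the helper name `accuracy_standard_error` are mine, not something from the question; the sample counts come from the training log.

```python
import math


def accuracy_standard_error(accuracy, n_samples):
    """Binomial standard error of an accuracy measured on n_samples examples."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_samples)


total = 42663 + 3000  # training + validation rows from the question

# Compare the current 3000-row validation set with a 25% validation split.
for n_val in (3000, round(0.25 * total)):
    se = accuracy_standard_error(0.90, n_val)
    print(f"n_val={n_val}: standard error ~ {se:.4f}, "
          f"95% interval ~ +/-{1.96 * se:.4f}")
```

Under these assumptions an accuracy of 0.90 measured on 3000 samples is only pinned down to roughly plus or minus one percentage point, and a 25% validation split would about halve that uncertainty.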
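Answer 1 (score: 1)
I don't think a validation accuracy that fluctuates between 88% and 90% is really unstable. If you plot it on a 0-100 scale, this "instability" looks absolutely tiny.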
import numpy as np
import matplotlib.pyplot as plt
plt.plot(np.arange(20), np.random.uniform(88, 90, 20))  # randint(88, 90) would only yield 88 or 89
plt.title('Random Values Between 88 and 90')
plt.ylim(0, 100)
plt.show()
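Answer 2 (score: 1)
It is hard to tell without knowing the dataset. Currently you only use dense layers; depending on your problem, RNNs or convolutional layers might suit it better. What I can also see is that you use a fairly large batch size of 512. There are many opinions on what the batch size should be. From my experience, batch sizes above 128 can lead to poor convergence, but this depends on many things.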
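One concrete effect of the batch size is the number of weight updates per epoch. A small sketch, using the 42663 training samples from the log (the helper name `updates_per_epoch` is just for illustration):

```python
import math


def updates_per_epoch(n_samples, batch_size):
    """Number of mini-batches, and therefore weight updates, in one epoch."""
    return math.ceil(n_samples / batch_size)


n_train = 42663  # training samples, from the log in the question

for batch_size in (512, 128, 32):
    print(f"batch_size={batch_size}: "
          f"{updates_per_epoch(n_train, batch_size)} updates per epoch")
```

With a batch size of 512 the network gets 84 updates per epoch, against 334 with a batch size of 128, so over the same 20 epochs a smaller batch gives the optimizer roughly four times as many steps.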
In addition, you can add some regularization to the network by using Dropout layers.
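One more point: you may want to make sure shuffle=True is passed to model.fit() (it is the default in Keras); otherwise the model always sees the same data in the same order, which reduces its ability to generalize.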
Implementing these changes might fix the "bouncing loss"; I think shuffling is the most important one.
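Putting the suggestions from this answer together, a minimal sketch could look like the following. It assumes a `from keras import layers, models` import (the question does not show its imports). The dropout rate of 0.3 and the batch size of 128 are illustrative guesses rather than tuned values, and `build_model` and `train` are hypothetical helper names.

```python
from keras import layers, models  # assumed import; tensorflow.keras offers the same API


def build_model(n_features=22, dropout_rate=0.3):
    """Architecture from the question, with Dropout after the first two hidden layers."""
    model = models.Sequential()
    model.add(layers.Dense(128, activation='relu', input_shape=(n_features,)))
    model.add(layers.Dropout(dropout_rate))  # randomly zeroes activations during training only
    model.add(layers.Dense(16, activation='relu'))
    model.add(layers.Dropout(dropout_rate))
    model.add(layers.Dense(16, activation='relu'))
    model.add(layers.Dense(1, activation='sigmoid'))
    model.compile(optimizer='rmsprop',
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model


def train(model, x_train, y_train, x_val, y_val):
    """Fit with a smaller batch size and explicit per-epoch shuffling."""
    return model.fit(x_train, y_train,
                     epochs=20,
                     batch_size=128,
                     shuffle=True,
                     validation_data=(x_val, y_val))
```

With the arrays from the question this would be called as history = train(build_model(), partial_x_train, partial_y_train, x_val, y_val). Whether it actually smooths the validation curve depends on the data, so treat it as a starting point to experiment with.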